ABSTRACT 


CHANG,  CHIH-CHIEH  GEOFF.  Secure  Localization  and  Tracking  in  Sensor  Networks. 
(Under  the  direction  of  Professor  Wesley  E.  Snyder.) 

Localization  and  tracking  of  objects  of  interest  are  two  canonical  issues  in  sensor 
networks  research.  When  the  object  of  interest  is  static,  we  use  localization  algorithms  to 
identify  its  location.  When  the  object  of  interest  is  moving,  we  use  tracking  algorithms  to 
estimate  its  path  over  time.  Since  sensor  networks  are  often  deployed  in  remote  or  hostile 
terrains,  however,  security  becomes  another  critical  issue.  Hence  the  localization  or  tracking 
accuracy  would  go  down  as  a  result  of  the  presence  of  malicious  nodes. 

The  objective  of  this  dissertation  is  to  correctly  identify  the  malicious  nodes  during 
the  localization  and  tracking  processes.  A  novel  algorithm  based  on  relaxation  labeling  is 
presented  to  achieve  this  objective.  Our  approach  provides  a  different  perspective  from  the 
existing  literature  on  secure  localization  and  tracking.  Gurrent  literature  uses  statistical 
measures  to  perform  localization  and  tracking  as  accurately  as  possible  given  the  influence 
of  malicious  nodes.  Instead,  those  malicious  nodes  are  isolated  first,  and  use  only  data  from 
benign  nodes  to  perform  localization  and  tracking.  Both  simulations  and  field  experiments 
are  used  to  demonstrate  the  performance  of  our  algorithm. 

Novel  contributions  of  this  dissertation  are  as  follows:  1.  Extension  of  the  classic 
relaxation  labeling  algorithm  to  contain  higher-order  consistency  functions  defined  on  triples 
of  objects,  rather  than  pairs.  2.  Solution  of  the  secure  localization  problem  by  detecting 
malicious  nodes.  3.  Definition  of  the  secure  tracking  problem  in  the  presence  of  mali¬ 
cious  nodes,  and  solution  of  that  problem  combining  conventional  tracking  with  relaxation 
labeling. 
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1.1  Illustration  of  localization  and  tracking.  The  filled  circles  represent  sensor 
nodes.  In  (a),  the  star  represents  the  static  event,  and  3  nodes  are  active 
at  the  same  time.  In  (b),  the  stars  represent  the  target  positions  at  each 
time  instant  along  the  target  path,  and  one  new  node  is  activated  at  each 
successive  time  instant.  Precise  definition  of  “successive  time  instant”  will 

be  defined  later .  3 

1.2  A  schematic  illustrating  localization  and  tracking  of  a  single  target  [44].  The 
target  is  shown  as  an  asterisk,  (a)  The  target  enters  cell  A,  and  cell  A 
becomes  the  active  cell.  Nodes  in  cell  A  are  activated,  and  report  their 
measurements  to  the  CPU.  The  CPU  calculates  the  position  of  the  target. 

(b)  The  localization  information  when  the  target  is  inside  A  is  used  by  the 
CPU  to  predict  the  future  position  of  the  target.  Cell  B  is  the  new  cell  that 

the  target  is  likely  to  enter,  hence  all  nodes  in  cell  B  will  be  activated.  .  .  5 

2.1  Illustration  of  how  at  least  3  sensors  are  needed  to  resolve  ambiguity  in  a  2D 

problem.  In  (a),  two  sensor  nodes  create  ambiguity;  while  in  (b),  a  unique 
solution  can  be  found .  12 

3.1  Illustration  of  the  particle  filter  process.  In  (a),  we  are  given  a  set  of  samples 

x^(i)  and  weights  In  this  toy  example,  we  set  Ng  =  5.  From  (a)  to 

(b)  is  what  we  call  the  update  stage.  We  can  use  (3.22)  to  calculate  the  new 
samples  in  (b).  Note  that  from  (a)  to  (b),  the  weights  are  unchanged.  From 
(b)  to  (c),  we  calculate  new  weights  using  (3.23).  Note  that  from  (b)  to  (c), 
the  samples  are  unchanged.  Finally,  to  overcome  the  degeneracy  problem, 
we  perform  a  resampling  to  obtain  the  new  set  of  weights  q‘^{i)  and  samples 


x^(i) .  29 

3.2  Illustration  of  the  resampling  process.  After  we  build  two  cumulative  distri¬ 
bution  functions,  Q  in  (a)  and  T  in  (b),  we  begin  with  i  =  1  and  j  =  1.  As 


we  increment  j,  we  compare  c(j)  and  t{i).  If  c{j)  >  t(i),  we  will  increment  i 
until  t{i)  >  c{j)  again.  Every  time  we  increment  i,  we  will  set  x^(i)  =  x^*(j) 
and  q‘^{j)  =  On  the  other  hand,  if  c(j)  <  t{i),  we  will  do  nothing  except 
incrementing  j  until  c{j)  >  t{i)  again . 
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3.3  We  have  activated  6  sensor  nodes  consecutively,  which  are  denoted  as  0 
through  5.  The  lower  path  is  the  true  target  path;  while  the  upper  one 
is  fictitious.  In  this  scenario,  two  malicious  nodes  report  that  the  target  is 
most  likely  on  the  upper  path,  i.e.  Nodes  3  and  5  are  malicious.  Nodes  0,  1, 

2  &  4  report  correctly  the  range  to  the  actual  target  position,  lying  on  the 
true  path .  36 

4.1  An  scene  labeling  example  to  illustrate  the  relaxation  labeling  process.  In 
the  scene  we  have  three  objects:  object  1  is  a  circle,  object  2  is  a  square  and 


object  3  is  a  triangle .  39 

5.1  Some  example  parameter  settings  of  equation  (5.4) .  48 

5.2  Some  example  parameter  settings  of  equation  (5.8) .  51 


5.3  A  sensor  network  consisting  of  four  nodes.  Node  1  and  node  4  are  malicious; 
while  node  2  and  node  3  are  benign.  In  this  example,  node  2  and  node  3 
are  equidistant  to  both  the  true  event  and  the  fictitious  event.  Hence  they 
appear  to  be  consistently  reporting  on  the  fictitious  event.  Ill-conditioned 


cases  like  this  have  been  excluded  in  our  simulations .  53 

5.4  Simulation  Example  of  7  nodes.  Two  of  the  nodes  are  malicious .  55 

5.5  Convergence  of  P(b) .  56 
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5.8  Failure  rate  for  a  network  of  various  number  of  nodes.  At  each  network  size, 

only  2  nodes  are  malicious .  59 

5.9  Deployment  of  sensor  nodes  in  the  field  (Unit:  4  feet).  The  sensor  nodes  are 

denoted  as  circles,  while  the  event  node,  positioned  at  (5,5),  is  denoted  as  a 
cross .  59 


5.10  For  each  measurement  at  sensor  nodes,  we  plot  a  circle  using  the  range  es¬ 
timates.  Note  that  the  event  node  is  at  the  center  of  the  field,  (5,5).  The 
intersections  of  most  of  the  circles  are  either  on  or  near  the  event  node  due 

to  noise .  60 

5.11  The  probability  of  each  sensor  node  being  benign,  P{b).  Po{b)  and  Pi{b)  go 

down  to  0,  which  means  that  these  two  nodes  cannot  be  benign .  61 

5.12  Illustration  of  the  voting  algorithm  (Reproduced  from  [46]) .  62 

5.13  Comparison  of  the  performance  of  the  relaxation  labeling  algorithm  and  the 
voting  algorithm  in  a  20- node  network.  In  (a),  only  one  node  is  malicious. 

In  (b),  (c)  and  (d),  the  number  of  malicious  nodes  increase  from  seven  to 
nine.  Note  that  both  algorithms  perform  equally  well  when  there  are  2  - 
6  malicious  nodes.  Since  the  experiments  are  repeated  for  100  times,  each 
with  a  different  set  of  random  node  locations,  overlapping  results  are  marked 
at  the  same  spots  in  (a)  -  (d).  The  voting  algorithm  occasionally  concludes 
that  the  event  is  actually  at  the  fictitious  event  location,  (7.0,  7.0).  However, 
our  algorithm,  relaxation  labeling,  correctly  localize  the  event  at  (3.0, 3.0)  in 
every  case. 
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5.14  Histogram  of  e  for  all  triples  in  a  seven-node  simulation  environment.  Scale 

is  [0,10]  X  [0,10],  noise  variance  =  10“®.  The  histogram  of  Ca  is  shown 
in  red,  while  part  of  the  histogram  of  €b  is  shown  in  blue.  Note  that  the 
histogram  of  ei,  larger  than  1  is  not  shown  because  the  maximal  has  a 
much  larger  scale  than  any  of  Ca .  72 

5.15  Histogram  of  {cai,  •  •  •  ,ea600o})  =  10“®.  The  maximal  value  on  the  hori¬ 

zontal  axis  is  chosen  relative  to  the  maximal  value  of  Cq  for  clearer  presentation.  73 


5.16  {cai,---  ,ea60oo})  =  10“^.  The  maximal  value  on  the  horizontal  axis  is 

chosen  relative  to  the  maximal  value  of  Ca  for  clearer  presentation .  73 

5.17  {cai,---  ,ea60oo})  =  10“^.  The  maximal  value  on  the  horizontal  axis  is 

chosen  relative  to  the  maximal  value  of  Ca  for  clearer  presentation .  74 


5.18  (a)  The  maximum  (as  represented  by  ^  ^ai)  versus  noise  variance 
cj^  (b)  The  mean  (as  represented  by  versus  noise  variance 

cj^  .  75 


5.19  Some  example  PDFs  of  an  exponential  random  variable .  75 

5.20  Effect  of  ai  and  02  on  the  system  performance.  We  choose  a  higher  noise 

variance,  =  1.0  x  10“^  in  this  figure.  There  is  little  or  no  effect  on  the 
system  performance  while  the  values  of  ai  and  02  are  being  changed.  ...  79 


5.21  Number  of  iterations  required  to  reach  convergence  at  various  number  of 
network  sizes.  We  can  see  that  the  system  converges  faster  as  n  gets  larger. 
However,  the  speed  of  convergence  does  not  get  any  larger  as  the  network 
reaches  a  certain  size  (20  in  this  experiment) .  84 

6.1  At  t  =  0,  we  assume  P(x®)  is  known,  and  this  information  is  passed  to  the 
two  nodes  activated  at  t  =  1.  Using  the  particle  filter  algorithm,  node  1  and 
node  2  can  each  calculate  p{'x^\z^),  and  those  information  is  passed  to  node 
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produce  two  distinctive  ^(x^jz^).  Note  that  in  this  model,  each  node  (e.g. 
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6.5  We  can  activate  three  nodes  at  each  time  step  during  the  tracking  process. 

Each  successor  node  will  have  three  inputs,  hence  it  can  produce  three  dif¬ 
ferent  outputs.  The  inconsistency  between  its  three  outputs  can  be  used  in 

the  relaxation  labeling  process .  93 

6.6  Illustration  of  the  relaxation  labeling  algorithm.  The  rectangular  box  stands 
for  relaxation  labeling  algorithm.  At  each  time  step  (except  time  0),  we 
activate  3  nodes.  Eor  every  3  nodes  (except  time  0),  we  perform  relaxation 
labeling  algorithm  to  detect  malicious  nodes.  After  removing  malicious  nodes 
and  average  the  results  from  benign  nodes,  the  relaxation  labeling  algorithm 


produces  a  correct  result  and  pass  it  on  to  the  next  time  step .  94 

6.7  Tracking  of  the  target  in  a  two-dimensional  space,  x,  by  using  three  sensor 
nodes.  We  activate  three  sensor  nodes  at  each  time.  The  three  sensor  nodes 
act  independently  to  perform  tracking .  97 
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6.8  Tracking  of  the  target  by  using  four  sensor  nodes.  The  experimental  setup 
is  identical  to  Figure  6.7.  The  only  difference  is  that  we  activate  four  nodes. 

6.9  Tracking  of  the  target  by  using  five  sensor  nodes.  This  experiment  is  similar 

to  Figure  6.7  except  that  we  activate  five  nodes . 

6.10  Illustration  of  the  node  activation  results.  The  states  (only  positions  are 

shown)  of  the  target  are  shown  as  diamonds.  At  t  =  20,  20  nodes  are  chosen 
to  calculate  the  mutual  information,  which  are  shown  as  circles.  Among 
them,  three  nodes  that  have  the  highest  mutual  information  will  be  selected 
as  active  nodes  at  t  =  21,  which  are  shown  as  filled  circles.  We  can  see  from 
the  trajectory  of  the  target  at  t  =  19,  20  that  the  current  best  estimate  of 
velocity  is  in  the  northwest  direction.  Hence  the  three  nodes  which  are  in 
the  northwest  direction  of  the  target  position  at  t  =  20  are  activated.  This 
agrees  with  the  highest  mutual  information  that  we  calculated . 

6.11  Illustration  of  the  node  activation  algorithm.  This  is  from  the  same  exper¬ 
iment  on  Figure  6.10,  except  that  the  result  here  is  extracted  at  t  =  30. 
Unlike  Figure  6.10,  it  is  harder  to  see  where  the  target  is  heading  based  on 
its  trajectory  at  t  =  28, 29,  30.  Hence  the  three  nodes  activated  at  t  =  30  do 
not  appear  to  fall  on  one  particular  spot  that  the  target  is  likely  heading.  . 

6.12  Adding  malicious  nodes  to  the  sensor  network.  We  choose  sensor  nodes 

and  to  be  malicious.  In  other  words,  there  is  one  malicious 
nodes  (out  of  three)  at  t  =  10, 20, 30, 40, 50 . 

6.13  Comparison  of  relaxation  labeling  and  averaging.  The  solid  line  is  obtained 

by  averaging  the  three  paths  in  Figure  6.12.  The  dotted  line  is  obtained  by 
removing  malicious  nodes  using  relaxation  labeling . 

6.14  Tracking  result  with  malicious  nodes.  We  activate  four  sensor  nodes,  and 

there  is  one  malicious  node  (which  can  sense  the  target)  att  =  10, 20, 30, 40, 50, 
and  that  node  remains  active  (malicious)  for  only  one  time  step . 

6.15  Comparison  of  the  tracking  performance.  The  solid  line  is  obtained  by  av¬ 

eraging  the  result  in  Figure  6.14,  and  its  MSE  is  0.7090.  The  dashed  line  is 
obtained  by  using  relaxation  labeling  to  remove  malicious  nodes.  Its  MSE  is 
0.3669,  which  is  smaller  than  the  MSE  of  averaging . 

6.16  Tracking  result  with  malicious  nodes  by  activating  five  sensor  nodes . 

6.17  Comparison  of  the  tracking  performance  for  five  active  nodes . 

6.18  Tracking  performance  under  the  influence  of  malicious  nodes.  Note  that  no 

secure  tracking  algorithm  is  performed  in  this  figure . 

6.19  Probability  of  being  malicious  nodes  for  the  five  nodes  at  t  =  10  and  another 
five  nodes  at  t  =  11.  Hence  we  have  5  x  2  =  10  P{m)  here.  The  first  five 
probabilities  are  for  P{X)  at  t  =  10.  The  last  five  are  for  t  =  11.  We  can  see 
that  at  t  =  10,  the  first  node  is  found  to  be  malicious,  while  at  t  =  11,  the 
second  node  is  found  to  be  malicious.  This  agrees  with  the  actual  data.  .  . 

6.20  Tracking  performance  under  the  influence  of  malicious  nodes.  Two  secure 
tracking  results  are  shown  in  this  figure.  One  is  averaging  (shown  in  solid 
line),  and  the  other  is  relaxation  labeling  (shown  as  the  dashed  line).  The 
MSE  for  averaging  is  1.2615.  The  MSE  for  relaxation  labeling  is  0.7331.  .  . 
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7.1  A  possible  scenario  where  a  target  travels  through  some  obstacles  where  no 

sensor  nodes  are  deployed.  In  this  scenario,  the  target  is  traveling  from  the 
left  to  the  right.  The  shaded  area  is  where  there  is  no  sensor  nodes  are 
deployed.  For  example,  consider  that  the  shaded  area  is  a  river.  The  target 
is  a  tank  that  we  are  tracking.  There  are  no  sensor  nodes  deployed  in  the 
river,  however,  the  tank  can  successfully  pass  through  the  river.  Assume  that 
it  takes  one  time  step  for  the  tank  to  pass  the  river,  we  cannot  determine  the 
location  of  the  tank  at  t  =  4.  The  problem  is  to  determine  which  nodes  to 
the  right  of  the  unavailable  area  should  we  activate,  at  t  =  5,  in  order  not  to 
miss  the  tank?  .  110 

7.2  Candidate  areas  for  node  activations.  The  candidate  area  is  calculated  based 
on  an  estimated  speed  and  an  estimated  turning  angle  of  the  target.  First, 
the  candidate  area  for  f  =  4  is  calculated.  Then  the  candidate  area  for  t  =  5 
is  calculated  based  on  the  candidate  area  for  t  =  4.  Those  nodes  inside  the 
candidate  area  for  t  =  5  will  be  activated  to  detect  possible  target  appearances. Ill 

7.3  Maximum  likelihood  estimates  of  the  target  locations.  We  predict  the  maxi¬ 
mum  likelihood  target  location  of  the  target  at  t  =  4.  Based  on  the  estimated 
target  location  at  t  =  4,  we  predict  the  maximum  likelihood  location  of  the 
target  at  t  =  5.  We  only  activate  the  nodes  within  a  certain  neighborhood 
of  the  estimated  target  location  at  t  =  5.  The  size  of  the  neighborhood  can 
be  adjusted  according  to  our  confidence  of  the  prediction  of  the  likely  target 
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Chapter  1 

Introduction 

1.1  What  Are  Sensor  Networks? 

Recent  advances  in  microelectronic  technology  have  enabled  a  new  generation 
of  sensor  networks  which  holds  the  potential  to  revolutionize  our  economy  and  society 
[23,  16,  58,  50].  Large  numbers  of  tiny  and  smart  sensor  nodes  will  be  sprayed  onto  roads, 
bridges,  factories  and  forests.  These  “sensor  nodes”  collaboratively  monitor  all  kinds  of 
information,  including  light,  sound,  temperature,  humidity,  acceleration,  voltage,  motion, 
radiation,  etc.  Using  the  analogy  of  the  human  nervous  system,  the  numerous  sensor  nodes 
are  just  like  our  digital  receptors  to  monitor  the  physical  world.  Moreover,  the  information 
collected  by  the  sensor  networks  will  be  transmitted  over  the  networks,  analogous  to  our 
digital  spinal  cord  and  digital  neural  networks.  Finally,  the  information  will  be  processed 
by  digital  computers,  analogous  to  our  digital  brain.  Hence  a  “digital  nervous  system”  can 
be  created  for  us  to  instrument  the  physical  world  at  a  global  scale. 

A  sensor  node  is  made  up  of  four  basic  components:  a  sensing  unit,  a  processing 
unit,  a  transceiving  unit  and  a  power  unit  [Ij.  The  sensing  unit  may  have  any  number  of 
sensors  which  can  detect  different  types  of  signals  (light,  sound,  temperature,  etc.).  The 
sensing  unit  also  may  include  analog-to-digital  converters  to  convert  the  measurements  into 
digital  data.  The  processing  unit,  which  usually  contains  a  small  data-storage  unit,  coor¬ 
dinates  the  cooperation  with  other  sensor  nodes  and  manages  the  activation/hibernation 
schedule.  The  transceiving  unit  may  contain  wired  or  wireless  communication  components 
in  order  to  report  sensor  measurements.  Finally,  the  power  unit  may  use  power  sources 
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like  battery  or  solar  energy  [59],  depending  on  the  cost  of  manufacturing  and  application 
needs.  Currently,  companies  like  Arch  Rock,  Crossbow,  Dust  Networks,  Millennial  Net, 
and  Moteiv  offer  various  types  of  sensor  nodes,  and  [28]  gives  an  overview  of  the  hardware 
platform  of  sensor  nodes. 

Sensor  networks  can  be  used  in  many  different  applications,  ranging  from  scientific 
to  educational  to  military.  For  example,  in  environmental  applications,  sensor  networks  can 
be  used  to  track  wild  birds  on  a  deserted  island  [56].  Before  the  advent  of  sensor  networks, 
estimating  the  number  of  wild  birds  was  a  daunting  task  because  birds  migrate  over  a  large 
area.  Deploying  a  sensor  network  can  easily  solve  this  problem  by  covering  the  entire  area 
with  low-cost  sensor  nodes  that  are  equipped  with  sound  sensors  to  keep  track  of  the  specific 
bird  chirp.  Another  example  is  to  embed  sensor  networks  in  the  physical  environment  to 
construct  a  developmental  problem-solving  environments  for  early  childhood  education  [54] . 
This  allows  “person  to  physical  world  communications”  as  opposed  to  traditional  “person 
to  person”  or  “person  to  computer”  communications.  This  is  a  natural  application  of  sensor 
networks  as  young  children  learn  by  exploring  and  interacting  with  objects  such  as  toys  in 
their  environment.  Finally,  a  military  application  is  the  VigilNet  system  [27].  He  et  al.  [27] 
describe  the  design  and  implementation  of  a  sensor  network  system,  VigilNet,  specifically 
for  military  surveillance  purposes.  In  the  VigilNet  project,  70  sensor  nodes  are  deployed  in 
a  280-foot  long  perimeter  in  a  grassy  field  that  would  typically  represent  a  critical  military 
“choke  point” .  The  VigilNet  system  can  track  moving  vehicles  in  the  surveillance  area  using 
the  sensor  network.  These  three  example  applications  of  sensor  networks  are  among  the 
new  applications  which  continue  to  be  discovered  and  envisioned  by  scientists  around  the 
world. 

1.2  Localization  and  Tracking  in  Sensor  Networks 

In  order  to  realize  the  immense  potential  of  sensor  networks,  one  of  the  critical 
capabilities  of  a  sensor  network  is  to  keep  track  of  the  specific  objects  of  interest.  If  the 
object  of  interest  is  static,  that  is,  its  spatial  coordinates  do  not  change  in  time,  we  denote 
it  as  an  event.  The  algorithm  that  the  sensor  network  uses  to  identify  the  spatial  position  of 
the  static  event  is  denoted  as  an  event-localization  algorithm,  or  more  briefly,  a  localization 
algorithm.  After  the  event  has  been  detected  and  localized,  it  may  move  or  be  moved  by 
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an  external  force.  We  denote  a  moving  object  of  interest  as  a  target. 

In  an  event-localization  scenario,  three  or  more  sensor  nodes  will  be  turned  on  at 
all  times  (we  will  explain  the  details  in  Chapter  2).  Once  an  event  occurs,  those  sensor 
nodes  which  have  detected  the  event  will  report  their  measurements.  Based  on  the  mea¬ 
surements,  we  can  run  localization  algorithms  to  find  out  the  location  of  the  event. 

When  the  target  is  moving,  we  may  use  target-tracking  algorithms  (or  simply  track¬ 
ing  algorithms)  to  continuously  estimate  its  current  position  at  real-time  rates. 

Figure  1.1  gives  illustrations  of  localization  and  tracking  scenarios. 


(a)  localization 
t  =  m  t  =  m 


(b)  tracking 


t  =  k  -\-3  t  =  k  4 


Figure  1.1:  Illustration  of  localization  and  tracking.  The  filled  circles  represent  sensor  nodes. 
In  (a),  the  star  represents  the  static  event,  and  3  nodes  are  active  at  the  same  time.  In  (b), 
the  stars  represent  the  target  positions  at  each  time  instant  along  the  target  path,  and  one 
new  node  is  activated  at  each  successive  time  instant.  Precise  definition  of  “successive  time 
instant”  will  be  defined  later. 


Although  the  purposes  of  both  localization  and  tracking  algorithms  are  to  estimate 
the  spatial  position  of  the  object  of  interest,  subtle  differences  between  the  two  algorithms 
exist.  We  will  explain  the  details  of  localization  and  tracking  later  in  this  dissertation.  To 
give  the  reader  an  overview,  the  differences  between  a  localization  algorithm  and  a  tracking 
algorithm,  as  defined  in  this  dissertation,  are: 

•  The  localization  process  happens  at  the  same  time  instant;  while  the  tracking  process 
takes  many  time  steps. 
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•  A  localization  algorithm  usually  requires  three  or  more  sensor  nodes  activated  at  any 
time;  while  a  tracking  algorithm  may  require  as  few  as  one  node  [47].  Details  of 
localization  and  tracking  will  be  given  in  later  sections. 

In  [44] ,  Li  et  al.  provide  a  general  framework  to  detect  an  event  using  localization 
algorithms  and  subsequently  track  its  movement  using  tracking  algorithms,  as  illustrated  in 
Figure  1.2.  In  Figure  1.2,  the  monitored  area  is  divided  into  different  cells.  A  cell  is  defined 
as  a  local  collection  of  nodes,  all  of  which  are  active  at  a  given  time.  In  order  to  preserve 
battery  power,  not  all  cells  are  simultaneously  activated.  Note  that  although  Figure  1.2 
shows  disjoint  cells,  this  definition  does  not  necessarily  require  disjoint,  non-overlapping 
cells.  They  also  assume  (as  does  this  dissertation)  that  there  is  a  central  processing  unit 
(CPU)  that  collects  the  information  from  all  of  the  sensor  nodes  in  a  specific  collection  of 
cells.  The  basic  approach  for  localization  and  tracking  a  single  target  is  reproduced  and 
summarized  as  follows  [44]  (for  complete  details,  please  refer  to  [44]): 

1.  Some  and  perhaps  all  of  the  nodes  in  cell  A  detect  the  target.  These  nodes  are 
the  “active  nodes”,  and  cell  A  is  the  “active  cell”.  The  active  nodes  report  their 
measurements  to  the  CPU. 

2.  At  each  time  instant,  the  CPU  determines  the  location  of  the  target  from  the  mea¬ 
surements  of  the  active  nodes. 

3.  The  CPU  uses  locations  of  the  target  at  the  N  successive  time  instants  to  predict  the 
location  of  the  target  at  M  future  time  instants. 

4.  The  predicted  positions  of  the  target  are  used  by  the  CPU  to  activate  new  cells  that 
the  target  is  likely  to  enter.  This  is  illustrated  in  Figure  1.2(b)  where  cell  B  represents 
the  region  that  the  target  is  likely  to  enter  after  the  current  active  cell  (cell  A  in  Fig 
1.2(a)). 

5.  Once  the  target  is  detected  in  the  new  cell,  it  is  designated  as  the  new  active  cell  and 
the  nodes  in  the  original  active  cell  (cell  A  in  Fig  1.2(a))  may  be  put  in  the  standby 
state  to  conserve  energy. 

The  steps  1-5  are  repeated  for  the  new  active  cell,  and  this  forms  the  basis  of 
a  sensor  network  monitoring  system.  Chu  et  al.  [13]  also  provides  a  similar  mechanism  for 
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the  localization  and  tracking  of  targets. 

Note  that  once  the  target  is  moving,  it  is  possible  to  activate  all  of  the  sensor 
nodes  in  the  network  and  use  those  nodes  who  detect  the  target  to  perform  localization  at 
each  successive  time  step.  In  other  words,  we  could  activate  all  nodes  in  the  network,  and 
performing  “tracking”  by  localization  at  each  time  step  without  using  the  actual  tracking 
algorithm.  However,  activating  all  of  the  sensor  nodes  in  the  network  will  consume  much 
more  power  than  needed  since  at  each  time  step  only  those  nodes  near  the  current  target 
position  can  detect  it.  Those  nodes  who  are  not  in  the  vicinity  of  the  target  are  turned  on 
unnecessarily.  Hence  tracking  algorithms  allow  us  to  only  activate  necessary  nodes.  Those 
nodes  not  in  the  active  cell  can  be  put  in  the  standby  state  to  conserve  energy. 


(a)  (b)  ^ 


Figure  1.2:  A  schematic  illustrating  localization  and  tracking  of  a  single  target  [44].  The 
target  is  shown  as  an  asterisk,  (a)  The  target  enters  cell  A,  and  cell  A  becomes  the  active 
cell.  Nodes  in  cell  A  are  activated,  and  report  their  measurements  to  the  CPU.  The  CPU 
calculates  the  position  of  the  target,  (b)  The  localization  information  when  the  target  is 
inside  A  is  used  by  the  CPU  to  predict  the  future  position  of  the  target.  Cell  B  is  the  new 
cell  that  the  target  is  likely  to  enter,  hence  all  nodes  in  cell  B  will  be  activated. 


1.3  Security  in  Sensor  Networks 


Sensor  networks  are  often  deployed  in  remote  or  hostile  terrains,  hence  they  are 
often  susceptible  to  various  attacks.  Ilyas  et  al.  [33]  provide  several  survey  articles  on  dif- 
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ferent  types  of  security  issues  in  sensor  networks.  For  the  purpose  of  this  dissertation,  we 
define  two  types  of  sensor  nodes:  malicious  and  benign.  The  purpose  of  malicious  nodes 
is  to  lower  the  accuracy  of  the  localization  and  tracking  processes.  The  adversary  may 
physically  capture  some  unknown  number  of  existing  benign  nodes,  and  replace  them  with 
malicious  ones.  Or,  the  adversary  may  remotely  login  to  the  sensor  nodes  and  turn  them 
into  malicious  nodes,  assuming  that  such  network  connection  is  available. 

Furthermore,  we  define  malicious  nodes  to  be  able  to  authenticate  correctly  with 
the  sensor  network.  The  malicious  nodes  also  use  the  same  type  of  encryption  codes  to 
communicate  with  the  other  nodes  in  the  network.  Malicious  nodes  can  inject  false  local¬ 
ization  or  tracking  reports  into  the  network  without  being  detected  using  authentication  or 
encryption  methods.  In  this  dissertation,  we  will  provide  novel  algorithms  to  detect  those 
malicious  nodes,  which  are  assumed  to  be  less  than  half  of  the  total  nodes  inside  the  sensor 
network,  based  solely  on  their  behavior. 

1.4  Overview  of  the  Dissertation 

In  dissertation,  two  interrelated  problems  are  presented:  secure  localization  and 
secure  tracking.  In  both  localization  and  tracking,  there  is  an  unknown  number  of  malicious 
nodes  in  the  network  (we  generally  assume  that  the  number  of  malicious  nodes  are  fewer 
than  benign  nodes).  The  malicious  nodes  will  attempt  to  lower  the  accuracy  of  localization 
and  tracking  by  reporting  that  the  target  is  at  a  fictitious  location  which  is  different  from 
the  true  target  location.  Furthermore,  we  assume  that  malicious  nodes  will  agree  (collude) 
on  that  fictitious  position.  Our  objective  is  to  detect  those  malicious  nodes. 

We  propose  a  novel  algorithm  to  detect  malicious  nodes,  in  both  localization  and 
tracking  scenarios  [11,  9,  10].  The  rest  of  this  dissertation  is  to  explain  our  algorithm,  and 
it  is  organized  as  follows: 

In  Chapter  2,  we  will  provide  the  background  on  secure  localization.  Similarly, 
background  on  the  secure  tracking  problem  is  provided  in  Chapter  3. 


Chapter  4  gives  the  background  on  the  relaxation  labeling  algorithm.  The  relax- 
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ation  labeling  algorithm  is  a  prominent  method  to  reduce  ambiguity,  and  we  will  extend  it 
in  Chapter  5  and  Chapter  6. 

Chapter  5  is  our  proposed  algorithm  to  solve  the  secure  localization  problem.  It 
corresponds  to  the  problem  defined  in  Chapter  2.  Chapter  6  is  our  proposed  algorithm  to 
solve  the  secure  tracking  problem,  and  it  corresponds  to  the  problem  defined  in  Chapter  3. 

Finally,  Chapter  7  is  the  conclusion  and  discussions  on  future  work. 


Chapter  2 


Background  on  Localization 

2.1  Sensor  Models 

In  order  to  estimate  the  location  of  events,  an  adequate  model  of  the  sensors  is 
needed.  Given  an  adequate  sensing  model,  we  may  derive  location  estimation  algorithms 
for  such  sensors.  We  denote  the  time-dependent  sensor  measurement  from  sensor  node  i 
as  Zi{t).  We  denote  x(t)  as  the  unknown  event-related  parameters  that  we  would  like  to 
estimate,  and  in  this  dissertation  x(t)  is  the  unknown  event  location  in  a  two-dimensional 
space.  One  general  model  to  relate  Zi{t)  to  x(f)  is  given  as  [13] 

Zi{t)  =  i  =  (2.1) 

where  /i  :  x  M  — >  M  is  a  (possibly)  non-linear  function  which  depends  on  x(t)  and  Wi.  Wi 
is  additive,  signal-independent  Gaussian  noise  whose  variance  cr^  is  assumed  to  be  known. 
Existing  literature  on  sensor  networks  commonly  uses  two  special  cases  of  the  sensor  model 
given  in  (2.1):  acoustic  amplitude  sensors  and  acoustic  array  sensors.  We  do  not  specifically 
define  the  form  of  h  here  because  they  could  be  different  in  both  types  of  sensors.  Please 
see  details  in  the  next  sections. 


2.1.1  Acoustic  Amplitude  Sensors 

Acoustic  amplitude  sensors  provide  a  range  estimate  as  follows  [40]:  First,  we 
assume  that  acoustic  signals  propagate  isotropically.  We  denote  the  known  sensor  node 
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position  as  for  the  ith  sensor  node.  For  acoustic  amplitude  sensors,  (2.1)  becomes 

a 

—  7  ..a  “1“ 

l|x-^i||2 

where  a  is  the  given  amplitude  of  the  signal  at  the  event  position  x,  a  is  a  known  attenua¬ 
tion  coefficient,  and  ||||  is  the  Euclidean  norm.  Equation  (2.2)  means  that  Zi  should  go  up 
as  a  goes  up,  except  for  the  influence  of  noise.  Similarly,  Zi  will  be  smaller  if  the  distance 
from  the  node  to  the  event  is  farther,  except  for  the  influence  of  noise.  Note  that  x  and 
are  both  vectors  denoting  spatial  locations,  where  x  is  the  (unknown)  spatial  location  of 
the  event,  and  is  the  known  spatial  location  of  the  sensor  node  i.  Eor  the  purpose  of  this 
dissertation,  both  x  and  are  in  the  two-dimensional  space  (more  details  will  be  given  in 
Eigure  2.1). 

Note  that  the  model  in  (2.2)  is  also  adopted  in  modeling  types  of  sensors  other 
than  acoustic  amplitudes.  Eor  example,  [37]  uses  this  model  in  radiation  sensors. 


In  this  dissertation,  we  assume  that  the  sensor  node  has  a  limited  sensing  range. 
When  the  event  is  out  of  the  sensing  range  of  the  node,  or  when  the  event  amplitude  is  too 
small,  the  sensor  node  will  not  be  able  to  detect  such  an  event.  Hence  we  do  not  consider 


the  exceptional  case  when 
term,  Zi  zzwi. 


~  0  and  the  received  signal  consists  only  of  the  noise 


2.1.2  Acoustic  Array  Sensors 

Acoustic  array  sensors  are  essentially  arrays  of  microphones,  and  signal  processing 
algorithms  such  as  beam-forming  [20,  12]  may  be  used  to  estimate  the  location  of  the  event. 
Acoustic  array  sensors  can  measure  the  Direction  of  Arrival  information  of  the  target,  and 
are  another  example  of  the  general  sensor  model  in  (2.1).  However,  in  this  dissertation,  we 
assume  that  each  node  has  a  single  sensor,  not  an  array.  Readers  interested  in  array  sensors 
should  refer  to  [61]  and  the  references  therein. 


2.2  Localization  Algorithms 

The  sensor  model  in  (2.2)  is  adaptable  to  different  types  of  sensors  since  it  has  been 
used  to  model  acoustic  amplitude  sensors  [40]  and  radiation  sensors  [37].  As  mentioned  in 
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Chapter  1,  the  sensor  in  a  sensor  node  could  be  a  sound  sensor,  a  pressure  sensor,  a  light 
sensor,  and  so  on.  In  order  to  model  as  many  different  types  of  sensors  as  possible,  the  sensor 
model  in  (2.2)  is  the  most  suitable  one.  Equation  (2.2)  simply  states  that  an  event,  which 
has  an  associated  event  amplitude,  decays  as  the  distance  gets  larger  (to  the  power  of  the 
attenuation  constant).  This  can  be  used  to  model  many  physical  phenomena.  Therefore, 
we  will  use  (2.2)  as  the  sensor  model. 

2.2.1  Localization  of  a  Single  Event 

The  most  common  localization  algorithm  is  triangulation  [61].  Similar  concepts 
of  triangulation  by  solving  a  linear  system  can  also  be  found  in  Global  Positioning  Systems 
(GPS)  [62]  to  locate  a  GPS  module  or  Time  of  Arrival  (TO A)  techniques  in  cellular  phone 
systems  [63].  In  triangulation,  we  form  a  system  of  linear  equations  of  the  unknown  x.  We 
may  estimate  x  by  inverting  the  system  matrix,  as  described  below. 

We  assume  that  a  single  target  exists  and  is  detected  by  sensor  i.  We  introduce  a 
term  range  for  each  sensor  node,  and  it  is  denoted  as  Si  which  is  a  measurement  of  jjx  — 
Letting  a  =  4  in  (2.2),  Si  is  defined  by 


Or  equivalently. 


(2.3) 


(2.4) 


Note  that  in  (2.3)  and  (2.4),  the  noise  term  in  (2.2)  is  implicit  in  the  measurement  Zi 


Finding  the  unknown  location  of  the  event,  x,  can  then  be  posed  as  minimizing 
the  following  objective  function 


n 

X  =  argmin  {Sj  —  jjx  —  ^J])^  .  (2.6) 

X  .  ., 

2=1 

Triangulation  is  a  method  to  approximate  the  solution  of  (2.6).  Since  we  assume 
sensor  nodes  are  deployed  in  a  two-dimensional  space,  we  denote  the  known  location  of 
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each  sensor  node  as  =  [xi  and  the  unknown  location  of  the  event  as  x  =  [xq  yo]^- 

Then 


{xi  -  xof  +  {yi  -  yof  =  5f ,  i  =  l,2,---,n  (2.7) 

Equation  (2.7)  basically  says  that  using  the  location  of  the  sensor  node  as  the  center  and 
the  range  6i  as  the  radius,  we  can  plot  a  circle.  The  location  of  the  event  lies  somewhere 
on  the  circle.  In  (2.7),  if  we  subtract  the  i  =  1  equation  from  the  the  other  equations,  we 
obtain  a  system  of  linear  equations  in  xq  and  yo  of  the  form 


2{xi  -  xi)xo  +  2(yi  -  yi)yo  =  {xf  -  x\)  +  {yf  -yl)  +  5l-  Sf,  i  =  2, . . . ,  n  (2.8) 


Defining 


A  = 


2{x2-xi)  2(y2-yi) 


b  = 


2{xn-xi)  2(y„-yi) 
{xj  -  xf)  +  (y|  -  yl)  +  df  - 


{xl  -  xj)  +  (y2  -  yf)  +  5l-5l 
we  get  a  linear  system  of  equations 


Ax  =  b.  (2.9) 

where  A  is  referred  to  as  the  “system  matrix” .  If  A  is  invertible,  the  location  of  the  event 
can  be  estimated  by 


i  =  A“^b.  (2.10) 

In  general,  we  need  at  least  2  sensors  in  a  one-dimensional  space  to  resolve  am¬ 
biguity.  Similarly,  we  need  at  least  3  sensors  in  a  2D  problem,  as  illustrated  in  Figure 
2.1.  Without  loss  of  generality,  we  will  assume  that  sensor  nodes  are  deployed  in  a  two- 
dimensional  space  from  this  point  on. 
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(a)  (b) 


Figure  2.1:  Illustration  of  how  at  least  3  sensors  are  needed  to  resolve  ambiguity  in  a  2D 
problem.  In  (a),  two  sensor  nodes  create  ambiguity;  while  in  (b),  a  unique  solution  can  be 
found. 


In  the  cases  when  we  have  more  than  enough  sensors  (n  >  3),  it  becomes  an  over¬ 
determined  problem,  and  a  mean-squared-error  estimate  of  the  location  of  the  event  can  be 
determined  using  the  pseudoinverse 

i  =  [A'^A]”^  A'^b.  (2.11) 

under  the  assumptions  that  the  inverse  of  A'^A  exists.  Geometrically,  since  (2.7)  is  draw¬ 
ing  a  circle,  solving  (2.9)  is  equivalent  to  estimating  the  common  intersection  of  all  of  the 
circles.  What  is  notable  is  that  one  sensor  alone  cannot  determine  the  location  of  the  event; 
it  requires  collaboration  of  multiple  sensors  to  solve  the  event-localization  problem.  Hence 
event-localization  is  a  typical  example  of  collaborative  signal  processing  [61] . 


2.2.2  Localization  of  Multiple  Events 


It  is  possible  to  extend  the  localization  algorithm  to  estimate  the  locations  of 
multiple  simultaneous  events.  In  this  case,  the  sensor  model  becomes  a  mixture 


= 


E 


+  Wi, 


(2.12) 
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in  which  aj  is  the  amplitude  of  each  event.  In  this  dissertation,  we  do  not  consider  multiple 
events  in  secure  localization  problems.  Please  refer  to  [8]  for  a  discussion  of  localization  of 
multiple  events. 

2.3  Related  Localization  Literature 

In  this  dissertation,  localization  refers  to  finding  the  location  of  the  event,  not 
the  location  of  the  sensor  node  itself.  There  exists  literature  in  which  a  sensor  network  is 
deployed  with  only  a  portion  of  the  nodes  knowing  their  locations  [32].  In  that  literature, 
localization  refers  to  using  those  nodes  who  know  their  own  locations  to  determine  the  lo¬ 
cations  of  the  rest  of  the  nodes  in  the  network  [32].  Although  related  to  this  dissertation, 
the  objective  (determining  is  distinctive  from  our  objective,  determining  x. 

In  the  robotics  literature,  there  is  a  topic  on  Simultaneous  Localization  and  Map¬ 
ping  (SLAM)  [14].  SLAM  refers  to  the  process  in  which  a  mobile  robot  explores  and  maps 
the  current  environment  while  keeping  track  of  its  own  current  position.  What  a  mobile 
robot  localizes  is  the  current  position  of  itself,  and  this  is  also  distinctive  from  what  we 
mean  by  localization  in  this  dissertation. 

Also  note  from  our  sensor  model  in  (2.2)  that  we  do  not  have  time  information 
from  the  event  since  we  cannot  know  when  the  event  occurred  in  advance.  Hence  we  cannot 
measure  how  long  it  takes  for  the  “signal”  to  propagate  from  the  event  position  to  the 
node  position.  There  exists  localization  techniques  to  determine  the  location  of  a  mobile 
subscriber  in  a  cellular  or  wireless  local  area  network  (WLAN)  environments  [53].  Those 
techniques  include  Time  of  Arrical,  Time  Difference  of  Arrival  and  Angle  of  Arrival,  but 
because  of  our  assumptions,  they  will  not  be  discussed  or  used  in  this  dissertation. 

2.4  Security  Probleui  iu  Localizatiou 

2.4.1  Problem  Statement 

Consider  the  case  when  we  have  deployed  a  sensor  network,  and  all  of  the  node 
positions  have  been  determined  and  calibrated.  All  of  the  nodes  are  active  and  listen  to 
the  environment  for  possible  event  occurrences,  and  those  who  have  detected  the  event  will 
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report  it.  An  unknown  number  of  benign  nodes  are  replaced  with  malicious  nodes  by  the 
adversary.  The  problem  of  secure  localization  [43]  is  to  correctly  identify  the  location  of  the 
event  under  the  influence  of  malicious  nodes.  Further  detail  follows. 

2.4.2  Problem  Definition 

For  the  n  nodes  that  have  detected  some  particular  event,  each  of  their  locations, 
(xj,  Hi),  and  range  reports  5i  are  known.  We  denote  a  set  of  3  nodes  as  a  triple.  Since  n  >  3, 

we  have  (  )  =  (^^3)131  =  *'”^"^1.2^3"^^  triples.  For  any  triple,  we  can  readily  obtain  an 

3 

estimate  of  the  event  location,  x  =  (To,  2/0),  using  triangulation. 

There  is  an  unknown  number  of  malicious  nodes  inside  the  network.  We  generally 
assume  that  malicious  nodes  are  fewer  than  benign  nodes  in  the  newtork.  We  define  a 
malicious  node  to  have  the  following  properties 

1.  The  objective  of  the  malicious  node  is  to  lower  the  accuracy  of  the  localization  of 
events  by  injecting  false  localization  reports  into  the  network. 

2.  All  of  the  malicious  nodes  collude  on  a  fictitious  evenf\  The  fictitious  event  location 
is  the  location  of  the  event,  as  reported  by  the  malicious  nodes.  It  is  almost  always 
different  from  the  true  location  of  the  event. 

3.  Malicious  nodes  can  successfully  authenticate  with  any  other  node  or  the  central 
server  of  the  network,  and  they  also  have  obtained  the  encryption  key  used  by  existing 
nodes.  Hence  malicious  nodes  can  successfully  communicate  with  the  network  (and 
each  other),  and  any  effort  to  use  encryption  or  authentication  to  detect  them  will  be 
futile. 

For  a  triple  whose  nodes  are  located  at 

we  denote  the  locations  of  the  nodes  of  that  particular  triple  by 


I  G  {a,  6,  c} 
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and  similarly  the  subscript  I  is  used  for  other  data  or  parameters  belonging  to  the  nodes  of 
that  particular  triple. 

Within  a  particular  triple,  we  have  the  ranges  5i,  I  G  {a,  b,  c},  calculated  using  the 
measurements  and  (2.4).  We  also  have  the  estimated  location  of  the  target,  x  =  {xo,yo), 
determined  by  (2.10) 


X  =  {xo,yo)  =  A  ^b. 

Then,  for  each  node  in  the  triple,  we  may  define  a  discrepancy^^  as  [46] 

Q  =  \Si  -  -  xiY  +  (yo  -  yiY\,  lG{a,h,c}  (2.13) 

since  the  location  of  each  sensor  node,  {xi,yi),  is  known.  In  (2.13),  the  discrepancy  has 
two  major  contributors:  noise  and  (possibly)  false  5i  generated  by  malicious  nodes.  Hence 
the  problem  is  to  classify  which  sensor  nodes  are  malicious  using  our  currently-available 
information: 

•  {xi,yi),  the  location  of  the  sensor  nodes 

•  5i,  the  range  estimate  reported  by  the  sensor  node  (or  the  measurement,  zi) 

•  (xo,yo))  the  estimated  event  location  based  on  zi 

•  ei,  the  error  term 

Note  that  we  allow  malicious  nodes  to  know  their  exact  location,  (x;,  yi),  by  either  physically 
capturing  existing  nodes  which  hold  location  information  or  breaking  into  our  system  to 
obtain  the  system  coordinates. 

2.4.3  Supplemental  Properties 
The  event  is  static 

In  this  section,  we  assume  that  the  source  of  the  event  detected  by  the  sensor 
network  is  not  moving,  or  moving  slowly  enough  that  its  motion  is  negligible  relative  to  the 
measurement  process. 
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Redundancy  of  the  sensor  measurements 

We  assume  that  when  an  event  occurs,  there  are  more  than  enough  (n  >  3)  nodes 
that  have  detected  the  event.  The  reason  why  the  required  number  of  nodes  is  greater  than 
three  is  that  we  need  more  than  one  triples.  If  n  =  3,  there  exists  only  one  triple  and  we 
cannot  use  the  consistency  among  triples  to  detect  malicious  nodes. 

Fictitious  event 

The  fictitious  event  location  has  to  fall  within  the  sensing  range  of  all  the  active 
nodes.  Otherwise,  the  benign  nodes  can  simply  tell  which  node  is  lying. 

Centralized  model 

Since  we  are  considering  a  secure  localization  scheme,  we  assume  a  centralized  sys¬ 
tem  in  which  a  central  processor  will  collect  information  from  all  active  nodes  and  calculate 
the  event  location.  This  is  due  to  the  fact  that  some  of  the  nodes  may  be  malicious,  hence 
we  need  a  central  processor  to  make  the  final  decision.  Secure  localization  methods  in  ad 
hoc  systems  [43]  are  not  covered  in  this  dissertation. 

2.4.4  Problem  Analysis 

The  challenge  of  this  problem  is  that  we  do  not  know  which  sensor  is  malicious 
in  the  first  place,  so  (xo,yo)  is  possibly  erroneous.  Hence  the  discrepancy  ei  depends  on 
whether  or  not  the  node  is  malicious.  Therefore,  the  problems  of  classification  of  sensors 
and  event  localization  are  like  a  ’’chicken  and  egg”  problem  -  whichever  comes  first  will 
solve  the  other.  If  we  knew  which  sensors  are  malicious  in  the  first  place,  we  could  use  only 
benign  sensor  measurements  to  do  the  localization  and  find  the  correct  event  location.  If 
we  knew  where  the  event  is  located,  (xq,  yo),  we  could  use  it  to  immediately  tell  which  node 
is  lying  by  verifying  6i  and  (xq  -  Xi)^  -h  (yo  - 

The  malicious  nodes  in  the  sensor  network  could  be  either  non-colluding  or  collud¬ 
ing.  For  non-colluding  attacks,  the  malicious  nodes  would  independently  generate  random 
location  estimation  reports  into  the  network;  while  colluding  nodes  could  uniformly  report 
a  new,  fictitious  event.  Current  research  have  proposed  methods  including  mean- squared- 
error  [46]  and  statistieal  filtering  [60,  65,  17]  to  detect  malicious  nodes.  When  the  attackers 
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are  non-colluding,  using  mean-squared-error  and  statistical  filtering  can  detect  those  at¬ 
tackers  who  are  outliers  to  the  community.  However,  when  the  attackers  are  colluding,  they 
become  a  consistent  group  which  is  harder  to  detect.  This  dissertation  focuses  on  colluding 
attacks  and  uses  relaxation  labeling  to  poll  every  triple  in  the  network,  and  can  detect  even 
a  large  number  of  malicious  nodes  in  the  network. 

Before  we  discuss  the  new  algorithm  in  detail,  we  would  like  to  point  out  that  the 
philosophy  of  relaxation  labeling  algorithms  is  to  use  “local  information”  to  achieve  global 
consistency.  Classical  relaxation  labeling  algorithms  use  pairs  of  objects  (local  information) 
to  resolve  ambiguity  of  the  entire  system  (global  consistency).  Due  to  the  nature  of  our 
problem,  we  extend  classical  relaxation  labeling  algorithms  to  use  triples.  However,  we  do 
not  need  quadruples  or  any  higher-order  compatibility  functions  because  the  philosophy  of 
relaxation  labeling  is  to  strive  to  use  as  local  information  as  possible.  Classical  relaxation 
labeling  algorithms  will  be  reviewed  in  Chapter  4.  A  new  relaxation  labeling  algorithm  will 
be  proposed  based  on  triples  of  sensor  nodes,  instead  of  pairs,  in  Chapter  5. 
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Chapter  3 

Background  on  Tracking 

3.1  Target  Tracking 

As  discussed  in  Section  1.2,  the  use  of  tracking  algorithms  allows  us  to  estimate  the 
current  position  of  the  target  using  fewer  sensor  nodes.  As  a  result,  less  power  is  consumed. 
In  this  chapter,  we  will  describe  the  concept  of  target  tracking  and  propose  the  problem  of 
secure  tracking. 

3.1.1  System  Models 

In  target  tracking,  we  aim  to  estimate  the  current  state  of  the  target (s)  based 
on  available  information,  including  past  target  states  and  measurements.  The  state  of  the 
target  can  be  location,  velocity,  orientation,  etc.  In  our  sensor  model,  the  nodes  can  get 
only  a  range  estimate  of  the  location  of  the  target(s).  Hence  in  this  dissertation,  we  will 
refer  to  the  location  and  velocity  of  the  target (s)  in  a  two-dimensional  space  as  the  state  of 
the  target(s).  We  extend  the  notation  of  the  previous  chapter  to  denote  {x^,  A:  G  N}  as  the 
state  of  the  target(s)  at  time  k,  where  y^]'^ ■ 

In  target  tracking  using  sensor  nodes  as  described  in  the  previous  chapter,  two 
models  form  the  foundation  of  all  algorithms:  the  motion  model  of  the  target  positions 
and  the  measurement  model  of  the  sensor  nodes.  A  good  motion  model  of  the  target  will 
certainly  facilitate  the  extraction  of  the  information  about  the  target  states  from  sensor- 
node  observations  [45].  Consider  the  evolution  of  state  sequence  {x^,A:  G  N}  of  a  target 
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given  by 


xfc  ^  ffc(xfc-i^v'=-i)  (3.1) 

where  is  the  state  of  the  target  at  time  k,  f  ^  x  ^  is  a  possibly  nonlinear  func¬ 
tion,  and  is  a  noise  sequence.  models  the  unpredictable  disturbances  [4]  while 

the  target  is  moving.  The  disturbances  are  assumed  independent  of  the  target  motion,  and 
we  do  not  assume  any  relation  between  the  elements  in  the  noise  sequence.  Hence  is 
an  independently  and  identically  distributed  (i.i.d.)  noise  sequence.  Note  that  in  (3.1), 
depends  only  on  x^“^,  but  not  on  any  of  the  previous  states  {x*, i  =  1,  •  •  •  ,k  —  2}.  Hence, 
for  the  purpose  of  this  dissertation,  (3.1)  defines  a  Markov  process  of  order  one. 

The  other  model  of  interest  is  the  measurement  model  of  the  sensor  nodes,  and  it 
is  given  by 


z^  =  h{^\w^)  (3.2) 

where  is  node  i’s  measurement  from  the  target  at  time  k,  /i  :  x  M  ^  M  is  a  possibly 

nonlinear  function,  and  is  an  i.i.d.  noise  sequence.  Similar  to  in  (3.1),  we  assume 
that  each  is  identically  distributed,  and  each  is  independent  from  each  other.  Hence 
is  i.i.d.  is  also  assumed  to  have  no  relation  to  the  target  state,  x^.  In  this  dissertation, 
we  consider  only  single-target  tracking  problems,  hence  in  (3.2),  x^  represents  the  state  of 
the  target  at  time  k.  We  use  the  amplitude  sensor  model  in  Section  2.1.1  since  it  models 
a  general  amplitude  measurement  and  can  be  applied  to  many  different  types  of  signals. 
Hence  (3.2)  follows  (2.2)  as 


4  =  ,  t  +"■?  (3-3) 

where  a  is  the  amplitude  of  the  signal  emitted  from  the  target  and  it  is  assumed  to  be 
known.  In  (3.3),  is  the  location  of  node  i,  and  it  is  also  assumed  to  be  known.  Note  that 
the  measurement  model  in  (3.3)  is  nonlinear  and  the  motion  model  in  (3.1)  could  also  be 
nonlinear. 
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3.1.2  Tracking  Algorithm 

In  a  tracking  problem,  our  purpose  is  to  estimate  the  probability  density  function 
(pdf)  that  the  target  is  in  state  x^,  based  on  measurements  =  {z^,i  = 

1,  •  •  •  ,  k).  We  have  two  stages  at  each  time  step  k:  prediction  and  update.  Suppose  that 
the  pdf  is  available.  In  the  prediction  stage,  we  obtain  the  prior  pdf  of  x^ 

via  the  Chapman-Kolmogorov  equation  [2] 

p{^^\z^'-’^-^)  =  j  (3.4) 

In  (3.4),  the  probabilistic  model  of  p(x^|x^“^)  is  defined  by  (3.1)  and  the  known 
statistics  of  Moreover,  in  (3.4),  the  pdf  p(x^“^|z^'^“^)  is  known  from  the  previous 

time  step,  k  —  1.  Hence  in  (3.4)  we  obtain  p(x^|2:^'^“^),  the  belief  that  the  state  at  time  k 
is  x^  given  past  measurements. 

As  mentioned  in  the  previous  section,  models  the  unpredictable  disturbances 
of  the  target  motion  [4].  Although  is  a  random  variable,  its  statistics  have  to  be  known 
in  order  to  derive  solutions  to  the  tracking  problems. 


Our  eventual  goal  is  to  estimate  p(x^|z^-^),  hence  we  need  the  update  stage. 


The  update  stage  starts  as  the  measurement  z^  becomes  available.  We  may  obtain 
the  desired  p(x^|z^-^)  using  Bayes  rule  as 


p{x^\z^'-^) 

where  the  normalizing  constant 


p{z^\x.^)p{'X.^\z^'^  1) 
p{z^\z^-^~^) 


(3.5) 


p{z^\z^'-'^-^)  =  J  (3.6) 

depends  on  the  likelihood  function  p{z^\x.^).  The  likelihood  function,  p{z^\x.^),  is  deter¬ 
mined  by  the  measurement  model  and  the  known  statistics  of  in  (3.2).  The  other  term 
in  (3.6),  p(x^|2:^'^“^),  is  already  obtained  using  (3.4).  Hence  we  can  calculate  p(2;^|z^'^“^)  in 
(3.6).  Similarly,  when  we  look  at  (3.5),  we  have  the  likelihood  function  p{z^\x.^)  from  (3.2), 
p(x^|2;^'^“^)  from  (3.4),  sin.dp{z^\z^'-^~^)  from  (3.6),  so  the  desired  pdf  p(x^|2;^'^)  is  obtained. 
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The  recursive  prediction  and  update  relations  of  (3.4)  and  (3.5)  form  the  optimal 
Bayesian  solution  to  the  tracking  problem.  They  are  only  a  conceptual  solution  in  general 
since  they  usually  cannot  be  determined  analytically  [2].  Solutions  do  exist  under  strict 
conditions,  which  we  will  cover  in  the  next  section. 

Note  that  the  initial  pdf  p{^\z^)  =  p(x^)  is  also  assumed  to  be  given.  In  our 
simulations,  p(x'^)  is  a  given  unit-variance  Gaussian  distribution  centered  at  x'^. 


Kalman  Filter 

There  exists  an  optimal  solution  to  the  tracking  problem,  the  Kalman  filter  [38,  39] , 
which  was  derived  based  on  (3.4)  and  (3.5).  The  Kalman  filter  assumes  that  the  system 
models  are  Gaussian  and  linear  [38,  39].  That  is,  (3.1)  becomes 

(3.7) 

where  F^  is  a  known  system  matrix.  Meanwhile,  (3.2)  becomes 

=  H'=x'=  +  (3.8) 

where  is  also  a  known  measurement  matrix^.  and  are  zero-mean  Gaussian 

noises  and  have  covariances  of  and  R^,  respectively.  Also,  v  is  independent  from  n. 

Note  that  is  different  from  the  in  (3.2)  because  is  strictly  zero-mean  Gaussian 

with  known  covariance  (for  details  of  the  derivation  of  the  Kalman  filter  please  refer  to 
Appendix  A). 

Under  such  Gaussian  and  linear  conditions,  the  Kalman  filter  can  be  derived  as  a 
recursive  relationship  involving  F^,  and  the  statistics  of  and  n^.  Relation  of  the 
Kalman  filter  to  our  target  tracking  problem  will  be  presented  in  the  following  sections. 

The  restrictions  of  Gaussianity  and  linearity  of  the  Kalman  filter  algorithm  is  of¬ 
ten  impractical.  Hence  there  exists  two  major  types  of  nonlinear  tracking  algorithms:  the 
extended  Kalman  filter  (EKF)  [35]  and  partiele  filters  [18].  They  are  also  referred  to  as 


^We  observe  that  our  measurement  model,  eq.  (3.2),  cannot  be  written  in  this  way 
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suboptimal  algorithms  since  the  Kalman  filter  is  the  optimal  solution  under  Gaussian  and 
linear  assumptions. 


Extended  Kalman  Filters 

The  EKF  [35]  allows  the  motion  model  in  (3.1)  and  the  measurement  model  in 
(3.2)  to  become  nonlinear.  The  concept  of  EKF  is  to  use  local  linearization  of  the  equations, 
(3.1)  and  (3.2).  In  the  EKF  algorithm,  we  use  the  notation  of  to  denote  the  expected 

value  m  at  time  ti  based  on  the  measurement  that  was  obtained  at  time  t2-  Assuming 
that  the  expected  value  of  the  target  position,  and  the  covariance  of  its  error, 

pfc-i|fc-i^  are  both  given  from  the  previous  time  step  {t  =  k  —  1),  and  the  pdf  at  t  =  A:  —  1 
is  Gaussianly  distributed 

^  AA(x^-^  pfc-i|fc-i)  (3  9) 

where  AA(x,  m,  P)  denotes  a  Gaussian  distribution  of  variable  x  whose  mean  is  m  and  whose 
covariance  is  P.  The  EKF  solution  to  the  tracking  problem  can  be  shown  to  be  [2] 

p(x^lzi^^-i)  AA(x^  P^'l^-^)  (3.10) 


where 


1  _  q/c— 1  _j_  p/cp/c— l|/c— 1 

^k\k  ^  ^k\k-i  ^  j^k 


-pklk  _  pfc|fc— 1  _  jjfepfe|fe— 1 


(3.11) 


(3.12) 

(3.13) 

(3.14) 


(3.15) 
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and  where  f^(-)  and  h^(-)  are  the  nonlinear  functions  from  (3.1)  and  (3.2),  and  H  are 
local  linearizations  of  these  nonlinear  functions 


(3.16) 

the  Jacobian  of  f^(x); 

_  dh’^ix) 

dx  —  l 

(3.17) 

the  Jacobian  of  h^(a:);  and  where 

gfc  ^  f^kpk\k-i 

(3.18) 

_ pfc|fc— 1 

(3.19) 

Particle  Filters 

Particle  filter  algorithms  [26,  34,  7,  15],  known  collectively  as  Sequential  Monte 
Carlo  algorithms  [18],  represent  the  density  function  by  a  set  of  particles.  The  particle 
filter  algorithm  described  in  [26]  was  demonstrated  to  have  better  performance  than  the 
EKF,  hence  that  algorithm  was  chosen  for  use  in  our  dissertation. 


In  the  particle  filter  algorithm,  the  key  idea  is  to  represent  the  required  density 
function  by  a  set  of  particles.  A  particle,  (x(i),  q{i)),  has  two  components:  a  sample  of  the 
state  vector,  x(i),  and  each  sample  is  also  associated  with  a  weight,  q{i).  To  initiate  the 
particle  filter  algorithm,  Ng  samples  x®(i),  i  =  1,  •  •  •  ,  Ng  are  drawn  from  the  known  prior, 
p(x°).  Each  sample  is  associated  with  a  weight,  q^{i),  and  Q^{i)  =  1-  Then  the  desired 
density  at  iteration  k  can  be  approximated  as 

Ns 

(3-20) 

i=l 

which  is  a  discrete  weighted  approximation  of  the  desired  pdf  and  (5()  is 


(3.21) 


Recall  that  in  a  tracking  problem,  we  have  two  stages:  prediction  and  update.  We 
now  explain  the  particle  filter  algorithm  proposed  by  Gordon  et  al.  [26]  at  these  two  stages. 
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•  Prediction 

Recall  that  for  a  target  tracking  problem,  we  are  given  two  models:  the  system 
model  in  (3.1)  and  the  measurement  model  in  (3.2).  At  the  prediction  stage,  each  sample, 
x^“^(i),  is  passed  through  the  system  model  in  (3.1)  to  obtain  a  corresponding  new  sample, 
x^*(i)  [26] 


x"‘(i)  =f"(x"-i(z),v"-i(i)),  i  =  l,--  -  ,iV,  (3.22) 

where  v^“^(i)  is  a  sample  drawn  from  the  given  pdf  of  the  noise  p(v^“^)  in  (3.1).  Intuitively, 
this  is  a  fast  way  of  obtaining  the  new  samples  at  t  =  k  since  the  system  model  (3.1)  and 
the  statistics  of  the  noise  are  both  known. 


•  Update 


At  the  update  stage,  recall  that  the  sensor  node  will  make  a  measurement  of  the 
current  target  state.  On  receipt  of  the  measurement  we  evaluate  the  likelihood  of  each 
sample,  p{z^\x^  (i)),  based  on  the  known  measurement  noise  properties  in  (3.2).  Then,  it 
can  be  derived  (see  [2])  that  we  can  calculate  the  weights  at  t  =  k,  q^{i),  according  to 


/(i)  = 

Ef=iP(^"|x"*(j)) 

where  the  denominator  is  the  normalization  to  ensure  9^(^) 
the  particle  filter  algorithm  in  Table  3.1. 


(3.23) 

1  [26].  We  summarize 
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Table  3.1:  Summary  of  the  particle  filter  algorithm  [2] 


Given: 

A  set  of  particles  and  measurement 


Objective : 

Calculate  the  updated  particles  {x^(i), 


Algorithm: 

1.  (prediction) 

for  i=l:A^s  { 

Generate  a  new  sample  by  passing  the  old  sample  through  the  system  model 
x^*(i)  =  f^(x^“^(z),  v^“^(z)) 

} 

2.  (update) 


3. 


for  i=l:A^s  { 

Calculate  q^{i)  = 

} 

(resampling) 

Resample  (i) ,  {i)}^i 

to  obtain  {x^(i), 


using  the  resampling  algorithm  in  Table  3.2 


Using  (3.22)  and  (3.23),  we  now  have  a  set  of  particles  (x^  (i),  (/^(i)),  i  =  1,  •  •  •  ,Ns. 
However,  the  update  stage  is  not  complete  yet.  There  is  a  “degeneracy  phenomenon”  which 
is  a  common  problem  with  particle  filter  algorithms  [48].  The  degeneracy  problem  occurs 
when  after  a  few  iterations,  all  but  one  sample  will  have  negligible  weights  [2].  To  avoid 
the  degeneracy  problem,  Gordon  et  al.  propose  to  perform  a  resampling  procedure  [26].  In 
the  resampling  procedure,  we  are  given  (x^*(i), g^(i)),  i  =  I,--  -  ,Ns,  and  the  output  is  a 
new  set  of  particles  (x^(z),  ^),  i  =  1,  •  •  •  ,Ns.  After  the  resampling  procedure,  the  update 
stage  is  complete.  Note  that  x^*(i),  i  =  1,  •  •  •  ,Ns  are  only  intermediate  samples  (hence  the 
asterisk  sign).  Rather,  the  new  x^(i),  i  =  1,  •  •  •  ,Ns  (after  resampling)  are  what  will  be  used 
and  passed  on  to  the  next  time  step. 


•  Resampling 

The  basic  idea  of  resampling  is  to  eliminate  samples  that  have  small  weights  and 
concentrate  on  samples  with  large  weights  [2].  We  begin  by  building  a  cumulative  dis¬ 
tribution  function  (CDF)  of  q^{j),  and  we  denote  it  as  cj,  j  =  I,--  -  ,Ns.  We  will  also 
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build  another  CDF  of  a  uniformly  distributed  random  variable,  and  we  denote  it  as  ti, 
i  =  1,  •  •  •  ,Ns.  After  we  have  these  two  CDF  functions  (both  of  equal  length  Ng),  we  begin 
at  the  bottoms  of  them,  i  =  1  and  j  =  1-  Next  we  will  iterate  through  Cj.  As  we  traverse 
along  Cj,  we  compare  Cj  with  ti.  There  are  two  cases: 

1.  If  Cj  >  ti,  we  will  set  x^(i)  =  x^*(j)  and  keep  incrementing  i  (while  setting  x^(i)  = 
x^*(j))  until  Cj  <  ti- 

2.  If  Cj  <  ti,  we  will  skip  this  x^*(j). 

In  other  words,  the  resampling  procedure  is  saying  that  when  we  encounter  a  weight,  q^{j), 
that  is  larger  than  that  which  would  result  from  a  uniform  distribution  {cj  >  ti),  we  will 
make  multiple  copies  of  the  x^*(j),  corresponding  to  the  same  q^{j),  to  multiple  x^(i)  (by 
incrementing  i  until  Cj  <  ti  again).  When  we  encounter  a  weight,  q^{j),  that  is  less  than 
what  a  uniform  distribution  would  produce  {cj  <  U),  the  x^  (i)  corresponding  to  q^(i)  is 
skipped.  This  follows  the  basic  idea  of  resampling. 

Readers  interested  in  other  variations  of  the  particle  filter  algorithms  should  refer 
to  [18]  and  the  references  therein. 

The  particle  filter  algorithm  in  Table  3.1,  in  essence,  is  a  process  of  producing  the 
pdf  p{'x}^\z^)  given  the  previous  pdf,  p(x^|z^“^),  and  the  current  measurement  z^.  At  every 
time  step,  only  one  measurement  (of  the  target  state)  is  required.  Hence  in  a  sensor  network 
scenario,  it  is  feasible  to  only  activate  one  sensor  node  to  make  the  measurement.  This  is 
the  beauty  of  target  tracking  algorithms.  Armed  with  the  known  models  in  (3.1)  and  (3.2), 
and  also  the  known  statistics  of  the  noises  in  them,  target  tracking  algorithms  produce  an 
estimate  of  the  current  target  location,  based  only  on  one  measurement. 
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Table  3.2:  Procedure  for  resampling  from  the  particles  [2] 

Given: 

A  set  of  particles  {x^*(i), 

Objective : 

Calculate  the  new  particles  {x^(i), 

Algorithm: 

(1) 

Initialize  the  Cumulative  Distribution  Function(CDF)  of  q^{j):  ci  =  0 
for  j  =  2:  Ns  { 

Construct  the  CDF:  cj  =  cj-i  +  q^{j) 

} 

(2) 

Initialize  another  CDF  of  a  uniformly  distributed  random  variable: 
Draw  a  value:  =  U  0,  ^ 

for  i  =  2  :  Ng  { 

U  =  ti  +  j^{i  —  1) 

} 

(3) 

Start  at  the  bottom  of  the  CDF  of  q^{j):  j  =  1 
Start  at  the  bottom  of  the  CDF  of  t:  i  =  1 
for  (i  =  1  :  Ng)  { 

while  {ti  >  Cj)  { 

j  =  j  +  1; 

} 

Assign  sample:  x^(i)  =  x^*(j) 

Assign  weight:  q^{i)  = 

} 
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For  better  illustration,  we  also  provide  a  toy  example  of  the  particle  filter  process 
in  Figure  3.1.  We  illustrate  this  example  in  a  two-dimensional  space.  In  Figure  3.1(a),  we 
are  at  time  step  /c  =  1,  and  we  have  a  set  of  q^{i)  and  x^(i),  where  i  =  1,  2, . . . ,  5.  Note  that 
in  Figure  3.1(a),  the  sequence  order,  i  =  1,2, ...  ,5,  for  both  q^{i)  and  x^(i)  is  determined 
at  the  initialization  of  the  particle  filter,  and  will  not  be  altered  at  any  later  time  step. 
In  Figure  3.1(b),  we  will  calculate  new  samples  and  denote  them  as  x^*(i),  i  =  1,2, ...  ,5. 
Note  that  the  weights  q^{i)  are  unchanged  from  Figure  3.1(a)  to  Figure  3.1(b). 

At  the  update  stage,  we  will  recalculate  the  new  weights,  q^{i)-  Note  that  from 
Figure  3.1(b)  to  Figure  3.1(c),  the  samples,  x^*(i),  are  unchanged.  Finally,  after  the  update 
stage,  we  will  perform  a  resampling  step  in  Figure  3.1(d).  After  resampling,  all  weights  are 
set  to  equal  to  where  W  =  5  in  this  toy  example,  and  Ng  =  500  in  our  simulations  in 
Chapter  6. 
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q'(2)=02 


Figure  3.1:  Illustration  of  the  particle  filter  process.  In  (a),  we  are  given  a  set  of  samples 
x^(z)  and  weights  In  this  toy  example,  we  set  Ng  =  5.  From  (a)  to  (b)  is  what  we  call 

the  update  stage.  We  can  use  (3.22)  to  calculate  the  new  samples  in  (b).  Note  that  from  (a) 
to  (b),  the  weights  are  unchanged.  From  (b)  to  (c),  we  calculate  new  weights  using  (3.23). 
Note  that  from  (b)  to  (c),  the  samples  are  unchanged.  Finally,  to  overcome  the  degeneracy 
problem,  we  perform  a  resampling  to  obtain  the  new  set  of  weights  q‘^{i)  and  samples  x^(i). 
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From  Figure  3.1(c)  to  Figure  3.1(d),  the  resampling  process  is  not  shown.  We 
present  an  illustration  of  the  resampling  process  in  Figure  3.2.  We  use  the  weights  to  build 
a  cumulative  distribution  function,  Q,  in  Figure  3.2(a).  We  can  also  build  a  cumulative 
distribution  function  for  a  uniform  random  variable,  and  we  denote  it  as  T  in  Figure  3.2(b). 
At  the  beginning,  the  index  for  Q,  j,  and  the  index  for  T,  i,  are  both  set  to  1.  We  then 
increment  i  and  j  based  on  the  resampling  algorithm  in  Table  3.2  as  follows: 

1.  i  =  1,  j  =  1,  Cl  =  0.05  <  ti  =  0.2 

2.  i  =  1,  j  =  2,  C2  =  0.2  >ti  =  0.2,  x2(l)  =  x2*(2) 

3.  i  =  2,  j  =  2,  C2  =  0.2  <t2  =  0.4 

4.  i  =  2,  j  =  3,  C3  =  0.7  >t2  =  0.4,  x2(2)  =  x2*(3) 

5.  i  =  2,,  j  =  3,  C3  =  0.7  >h  =  0.6,  x2(3)  = 

6.  i  =  4,  j  =  3,  C3  =  0.7  <  ^4  =  0.8 

7.  i  =  4,  j  =  4,  C4  =  0.9  >  ^4  =  0.8,  x^(4)  =  x^*(4) 

8.  i  =  5,  j  =  4,  C4  =  0.9  <  ts  =  1.0 

9.  i  =  5,  j  =  5,  cs  =  1.0  >  ts  =  1.0,  x^(5)  =  x^*(5) 

After  the  above  procedure,  the  resampling  is  complete.  The  new  set  of  sample  and 
weight  pairs  are  shown  in  Figure  3.1(d).  The  basic  idea  of  resampling  can  be  seen  in  this 
toy  example,  where  x^*(l)  is  ignored  because  (?^(1)  =  0.05  is  too  small.  On  the  other  hand, 
x^*(3)  is  duplicated,  because  q^{3)  =  0.5  is  larger. 

The  new  set  of  samples,  {x^(i)},  i  =  1, . . . ,  5,  will  be  passed  to  (3.22)  to  begin  the 
next  iteration  of  particle  filters. 

To  illustrate  the  performance  of  the  particle  filter,  we  provide  an  example  tracking 
problem  where  the  measurement  model  is  nonlinear  in  Appendix  B. 


31 


(a) 


T 


Figure  3.2:  Illustration  of  the  resampling  process.  After  we  build  two  cumulative  distribu¬ 
tion  functions,  Q  in  (a)  and  T  in  (b),  we  begin  with  i  =  1  and  j  =  1.  As  we  increment  j,  we 
compare  c{j)  and  t{i).  If  c{j)  >  t(z),  we  will  increment  i  until  t{i)  >  c{j)  again.  Every  time 
we  increment  i,  we  will  set  x^(i)  =  x^*(j)  and  q^{j)  =  On  the  other  hand,  if  c(j)  <  f(i), 
we  will  do  nothing  except  incrementing  j  until  c{j)  >  t{i)  again. 
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3.1.3  Collaborative  Tracking  Using  Sensor  Networks 

For  sensor  networks,  Liu  et  al.  [47]  propose  a  special  treatment  for  the  tracking 
problem.  At  any  time  k,  only  one  sensor  node  will  be  activated  to  perform  the  tracking 
task.  This  strategy  is  referred  to  as  leader-based  tracking.  We  will  not  reproduce  all  of  the 
details  in  [47]  here.  Instead,  we  describe  how  the  tracking  algorithm  is  performed  and  how 
to  activate  the  sensor  node  in  the  next  time  step. 

Liu  et  al.  [47]  assume  that  sensor  nodes  are  resource-limited,  hence  the  tracking 
task  itself  is  done  by  approximating  the  two  equations  in  (3.4)  and  (3.5)  numerically  by 
nonparametric  representations  of  p{'x.^\z^).  This  approach  may  reduce  the  computing  cost 
on  the  sensor  nodes,  however,  tracking  accuracy  is  not  guaranteed.  Instead  of  restricting 
the  capability  of  the  nodes,  in  this  dissertation  we  will  instead  use  the  particle  filter  tracking 
algorithm  which  will  perform  at  least  as  well  at  the  cost  of  computing  resources. 


As  for  the  selection  of  sensor  node,  Liu  et  al.  [47]  propose  to  use  mutual  information 
to  select  the  next  sensor  node  to  activate.  Let  U  and  V  be  two  random  variables  having  a 
joint  pdf  p{u,v).  The  mutual  information  between  U  and  V  is  defined  as 


I{U-,V)  =  E 


■  p{u,v) 
%{u)p{v) 


(3.24) 


At  time  step  k  —  1,  Liu  et  al.  [47]  propose  to  select  the  sensor  node  s  that  maximizes  the 
mutual  information  between  the  next  target  location,  x^,  and  the  new  measurement 


s  =  argmax  ^^z^lz^  ^) 

s&S 


(3.25) 


where  S  is  the  collection  of  sensors  that  the  current  node  can  talk  to,  namely  the  current 
node’s  neighborhood.  Using  the  definition  of  mutual  information  in  (3.24),  we  have 


Isi^  \z‘ 


k\  ^k-1.  k\  ^k-1 


\Z  \Z 


=  E 


log 


p{x^,z^\z^  ^) 
p{x^\z^~^)p{z^\z^~^) 


(3.26) 


In  summary,  the  node  activation  algorithm  is  that  at  time  step  k  —  1,  we  select  a  collection 
of  sensor  nodes,  S,  as  the  candidates  to  be  activated  in  the  next  time  step,  k.  For  each  node 
in  S,  we  will  compute  the  mutual  information  according  to  (3.26).  The  node  in  S  with  the 
highest  mutual  information  will  be  the  next  node  to  be  activated  at  step  k. 
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Let  us  look  at  how  to  compute  (3.26)  in  detail.  Recall  that  in  the  prediction  stage 
of  the  particle  filter  algorithm,  we  pass  the  set  of  samples  through  the  known  motion  model 
in  (3.1).  In  a  similar  manner,  at  time  step  k  —  1,  the  term  can  be  “predicted” 

by  passing  the  current  through  the  motion  model  in  (3.1). 

The  predicted  p{ii.^\z^~^)  can  be  represented  by  a  discrete  approximation,  say, 
a  set  of  particles.  Recall  from  the  previous  section  that  a  particle  has  two  components: 
(x(z), g(i)),  i  =  1, ...  ,Ns.  The  set  of  x(i),  i  =  1, . . . ,  is  the  samples  of  the  target  state. 
We  also  choose  a  set  of  real  values  to  quantify  the  possible  range  of  the  sensor  measurement 
z^ .  For  example,  we  can  use  [1,2,...,  Mg]  to  quantify  the  range  of  the  possible  values  of 
z^ .  We  denote  this  set  as  ,  and  Z^  G  M.  Then  for  every  element  in  the  set  of  x(i), 
i  =  1 , . . . ,  W  and  for  every  element  in  the  set  Z^ ,  we  can  calculate 

p(x^  =  p{z'^\^^)p{^^\z^-^)  (3.27) 

We  already  have  the  second  term  in  (3.27),  p{'xf^\z^~^),  by  passing  the  current  p{'x!^~^\z^~^) 
through  the  motion  model.  The  first  term  in  (3.27),  p{z^\'xf^),  can  be  calculated  using  the 
known  statistics  of  the  measurement  noise  in  (3.3).  Note  that  if  we  have  Ng  samples  x(i), 
and  there  are  Mg  values  in  Z^ ,  then  p(x^,  z^\z^~^)  in  (3.27)  is  a  W  x  Mg  matrix. 

After  we  have  the  joint  density  function  p(x^,  z^\z^~^),  we  can  calculate  the  partial 
density  function  p{z^\z^~^)  by 

p{z%^-^)  =  ^  p(x^  z^\z^-^)  (3.28) 

X*(j) 

Note  that  p{z^\z^~^)  is  now  a  vector  of  length  Mg. 


Now  that  we  have  p(x^ I ^),  p(x^,  and  p{z^\z^  ^), 

Ng  X  Mg  matrix 

^p(x^  I  z^~^)p{z^  I  z^~  ^  ) 

and  the  mutual  information  can  be  calculated  using 


we  can  calculate  the 
(3.29) 


x^(i) 


(3.30) 
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Note  that  the  scalar  value  of  Ig  is  calculated  for  every  sensor  node  in  the  collection  S.  The 
node  with  the  maximal  Ig  will  be  activated  in  the  next  time  step. 

We  follow  this  node  activation  scheme  as  part  of  our  secure  tracking  algorithms. 
However,  if  we  only  activate  only  one  sensor  node  at  a  time,  the  sensor  network  is  susceptible 
to  attacks.  Once  the  active  node  is  replaced  with  a  malicious  node,  the  tracking  result  will 
be  inaccurate.  Let  us  consider  the  following  security  problem  in  target  tracking. 

3.2  Security  in  Tracking 

3.2.1  Problem  Statement 

Consider  the  case  when  we  have  deployed  a  sensor  network,  and  all  of  the  node 
positions  have  been  determined  and  calibrated.  At  each  time  step,  one  or  more  sensor 
nodes  will  be  activated,  depending  on  battery  resources  and  security  needs.  Following  [47], 
we  assume  that  sensor  nodes  have  adequate  processing  power,  and  can  calculate  the  cur¬ 
rent  pdf  p{'x.^\z^)  after  making  measurements  of  their  ranges  to  the  current  target  location. 
The  current  pdf  is  reported  to  the  central  processor.  The  central  processor  will  select  the 
next  sensor  node(s)  to  activate,  and  send  p{'x.^\z^)  to  the  next  node(s)  in  order  to  calculate 
In  the  sensor  network,  an  unknown  number  of  the  nodes  are  malicious  and 
attempt  to  lower  the  accuracy  of  target  tracking.  The  problem  of  secure  tracking^  as  de¬ 
fined  in  this  dissertation,  is  to  correctly  estimate  the  current  target  location  by  removing 
malicious  nodes. 


3.2.2  Problem  Definition 

Our  objective  is  to  detect  the  malicious  nodes  in  the  network  during  the  tracking 
process.  The  basic  assumptions  given  in  Section  2.4  about  the  objective  and  behavior  of 
the  malicious  nodes  still  hold.  At  any  time  step,  all  the  malicious  nodes  will  agree  that 
the  current  target  location  is  the  fictitious  event  location  in  Section  2.4,  and  we  refer  to  it 
as  fictitious  target  location.  In  other  words,  at  any  time  step,  all  the  malicious  nodes  will 
report  a  pdf  p(x^|z^)  as  if  the  target  is  at  the  fictitious  target  location.  If  we  connect  the 
fictitious  locations  over  time,  we  form  a  fictitious  path.  We  illustrate  a  fictitious  path  in 
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Figure  3.3.  The  fictitious  path  will  mislead  our  defense  mechanism  and  allow  the  enemy  to 
avoid  detection.  The  colluding  behavior  is  also  another  feature  that  distinguishes  malicious 
nodes  from  ordinary  malfunction  nodes. 


In  this  section,  localization  is  not  performed  at  each  time  step,  hence  the  discrep¬ 
ancy  described  in  (2.13)  no  longer  exists.  Also,  in  tracking  we  do  not  activate  as  many 
nodes  as  in  localization  at  each  time  step.  In  localization,  there  are  multiple  nodes  active 
at  one  moment  in  time.  In  tracking,  only  one  node  is  necessary  since  the  motion  model, 
the  measurement  model  and  past  tracking  history  are  known  (we  will  activate  more  than 
one  node  in  our  secure  tracking  algorithm  in  Chapter  6). 


What  is  notably  different  from  Section  2.4  is  that  in  Section  2.4,  each  sensor  node 
only  reports  a  range  estimate  6i.  Here  in  target  tracking,  each  sensor  node  can  independently 
estimate  the  current  target  location  using  particle  filters.  We  summarize  what  a  particular 
sensor  node  does  at  each  time  step  k  as  follows^: 

1.  The  sensor  node  is  given 

2.  Next,  the  sensor  node  makes  a  measurement  of  the  target  using  the  measurement 
model  given  in  (3.2).  Hence  is  obtained. 

3.  Using  z^  and  p{'x.^~^\z^~^),  calculate  the  current  belief  p{'x.^\z^) 

Based  on  the  current  belief  p(x^|2;^)  from  the  sensor  node,  we  estimate  the  current 
location  of  the  target  using  the  posterior  mean 

=  J  xV(x^|z^)dx'=  (3.31) 

or  the  posterior  max 

=  argmax  p{x^\z^)  (3.32) 

From  this  point  on,  we  will  use  the  expected  value  as  in  (3.31)  to  illustrate  the  estimated 

current  location  of  the  target  in  the  figures.  However,  between  nodes  and  the  central  server, 

the  entire  density  function  p{'x!^\z^)  is  transmitted. 

^Note  that  since  we  are  describing  the  behavior  of  one  node,  we  omit  the  subscript  denoting  which  node 
this  is 
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Figure  3.3:  We  have  activated  6  sensor  nodes  consecutively,  which  are  denoted  as  0  through 
5.  The  lower  path  is  the  true  target  path;  while  the  upper  one  is  fictitious.  In  this  scenario, 
two  malicious  nodes  report  that  the  target  is  most  likely  on  the  upper  path,  i.e.  Nodes 
3  and  5  are  malicious.  Nodes  0,  1,  2  &  4  report  correctly  the  range  to  the  actual  target 
position,  lying  on  the  true  path. 


3.2.3  Supplemental  Properties 
Discrete  time  steps 

We  assume  that  time  is  discrete  in  this  dissertation.  Also,  the  measurement  process 
is  fast  enough  that  at  the  same  time  step,  the  target  is  (relatively)  stationary,  from  the 
beginning  to  the  end  of  the  measurement  process. 


Sensor  measurements 

At  each  time  step,  we  may  activate  more  than  one  sensor  node,  if  necessary.  For 
those  nodes  active  at  the  same  time,  we  assume  that  their  clocks  are  synchronized. 

Fictitious  target  location 

Similar  to  Section  2.4,  the  fictitious  target  location  has  to  fall  within  the  sensing 
range  of  all  the  active  nodes.  Otherwise,  the  benign  nodes  can  simply  tell  which  node  is 
lying. 


Centralized  model 

Although  sensor  nodes  can  perform  tracking  independently,  the  current  pdf  will 
be  collected  by  the  central  processor.  The  central  processor  also  determines  which  node(s) 
to  activate  in  the  next  time  step. 
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No  communication  delay 

We  do  not  consider  any  communication  delay  between  the  sensor  nodes  and  the 
central  processor.  So  even  if  the  target  is  moving,  we  can  still  keep  track  of  it  using  the 
tracking  algorithm. 
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Chapter  4 

Relaxation  Labeling 


The  problems  defined  in  Section  2.4  and  Section  3.2  are  critical  to  a  secure  sensor 
network  environment.  Before  we  could  solve  these  two  problems,  however,  we  need  to  re¬ 
view  the  relaxation  labeling  algorithm  which  will  be  the  foundation  of  our  solution  in  the 
next  chapters. 

Relaxation  labeling  was  proposed  in  the  seminal  paper  of  [51]  in  the  area  of  com¬ 
puter  vision  -  teaching  computers  to  recognize  the  content  of  digital  images.  The  original 
purpose  of  designing  the  relaxation  labeling  algorithm  was  to  reduce  the  ambiguity  in  iden¬ 
tifying  the  objects  in  a  scene.  Given  the  different  relationships  among  the  objects  in  the 
scene,  the  relaxation  labeling  algorithm  will  give  a  higher  weight  to  a  more  likely  solution 
and  vice  versa.  The  ambiguity  is  said  to  be  removed  when  the  objects  in  a  scene  can  each 
be  uniquely  labeled  with  a  consistent  label. 

Reducing  ambiguity  is  also  important  in  many  other  areas  of  science.  For  example, 
shape  and  stereo  matching,  image  enhancement  (edges,  features)  and  high-level  interpre¬ 
tation  of  image  content  [42]  all  can  be  posed  as  problems  of  reducing  ambiguity.  Over  the 
years,  relaxation  labeling  has  become  a  prominent  algorithm  in  solving  them. 

Let  us  begin  with  a  toy  example  in  image  content  interpretation.  In  Figure  4.1, 
we  have  an  image  which  has  three  objects.  Suppose  for  now  that  we  have  used  some  image 
segmentation  algorithm  to  obtain  this  image.  Based  on  the  result,  we  found  three  objects 
inside  this  image.  Assume  that  we  somehow  know  that  one  of  them  is  a  circle,  one  of 
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them  is  a  triangle,  and  one  of  them  is  a  square,  except  that  we  do  not  know  which  is  which. 
How  do  we  correctly  label  the  right  object  inside  the  image  to  have  the  correct  shape(label)  ? 


Figure  4.1:  An  scene  labeling  example  to  illustrate  the  relaxation  labeling  process.  In  the 
scene  we  have  three  objects:  object  1  is  a  circle,  object  2  is  a  square  and  object  3  is  a 
triangle. 


The  solution  lies  in  relaxation  labeling.  Introduced  in  1976  [51],  the  formulation  of 
“nonlinear  relaxation”  presented  here  has  become  the  de  facto  standard  method  for  solving 
consistent  labeling  problems.  We  review  the  method  in  this  section  to  ensure  the  reader 
understands  the  method,  before  extending  it  to  triples  and  applying  the  extension  to  sensor 
networks  in  Chapter  5.  Both  the  extension  and  application  are  novel. 

In  relaxation  labeling,  three  key  components  of  the  algorithm  are  defined  [51]: 

•  Pi{X):  the  “confidence”  for  object  i  to  have  label  A 

•  (?i(A):  the  “support”  for  object  i  having  label  A 

•  r{i,  j,  X,  X'):  the  compatibility  function  between  object  i  having  label  A  and  object  j 
having  label  A' 

Let  us  begin  the  explanation  of  relaxation  labeling  with  Pi{X).  Pi{X)  is  the  fi¬ 
nal,  desired  result  that  we  are  seeking.  The  value  of  T’i(A)  will  eventually  determine  the 
labeling  of  each  object.  After  we  introduce  T’j(A),  we  will  introduce  qi{X)  because  Pi{X)  de¬ 
pends  on  qi{X).  Finally,  r{i,j,  A,  A')  will  be  introduced  because  qi{X)  depends  on  r{i,j,  A,  A'). 
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Each  object  i  will  be  assigned  a  different  label  A,  and  we  assume  that  the  types 
of  labels  are  known.  For  example,  in  Figure  4.1,  there  are  three  labels:  circle,  triangle 
and  square.  The  confidence  that  one  can  say  object  i  has  label  A  is  called  Pi{\)  [51]. 
Pi{\)  is  a  number  between  0  and  1,  and  the  higher  it  is,  the  more  confidence  we  have 
in  saying  that  object  i  is  correctly  labeled  as  A.  Again  using  our  example  in  Figure 
4.1,  we  want  Pi{circle)  1  over  the  iterations.  Since  we  assume  the  labels  are  mutu¬ 
ally  exclusive  and  collectively  exhaustive,  Ylj  ^  any  iteration)  [51].  Hence 

Pi{circle)  +  Pi{square)  +  Pi{triangle)  =  1  in  Figure  4.1.  In  general,  a  relaxation  labeling 
process  works  with  a  set  of  labels  A  =  {Ai,  A2,  •  •  •  ,  A^}  and  Pi{\j)  =  1. 


Now  that  we  have  defined  Pi{\)  and  its  properties,  let  us  look  at  how  to  calculate 
it.  Relaxation  labeling  is  an  iterative  process.  For  iteration  t  -|-  1,  P/^^(A)  is  updated  as 
follows  [51] 


Pl{\)  [l  +  g*(A)] 


(4.1) 


j 

In  (4.1),  the  numerator  means  that  the  new  confidence  in  the  next  iteration, 

will  be  equal  to  the  current  confidence,  P/,  multiplied  by  [l  -|-  ?|(A)].  ql{X)  is  the  support 
of  object  i  having  label  A,  as  provided  by  other  labelings,  and  —1  <  g|(A)  <  1  [51].  Hence  if 
the  support  is  larger,  [l  -|-  qj{X)]  should  be  higher,  and  as  a  result  the  confidence  in  the  next 
iteration  will  be  higher.  If  the  support  is  negative,  it  will  drive  the  confidence  Pi{X)  in  the 
next  iteration  down.  We  can  also  see  that  in  (4.1),  the  denominator  acts  as  a  normalization 
factor  that  scales  P/'''^(A)  to  ensure  0  <  Pi{X)  <  1. 


In  (4.1),  if  ql{X)  is  large  (i.e.,  close  to  1),  it  will  eventually  drive  P/'''^(A)  to  be 
close  to  1.  On  the  other  hand,  if  qj{X)  is  negative,  it  will  drive  P^*'’'^(A)  toward  0.  Hence 
ql{X)  is  essential  to  how  we  will  end  up  getting  Pi{X).  Rosenfeld  et  al.  [51]  defines  (?*(A)  as 
follows 


ql{X)  =  Y,C^J 


J]r(i,i,A,A')Rj(A') 


(4.2) 


j  I  y  J 

where  Cij  is  an  optional  coefficient.  The  purpose  of  Cij  is  to  be  a  normalization  coefficient 
to  make  sure  that  —  1  <  ql{X)  <  1.  It  may  also  be  used  to  constrain  the  relationship  be- 
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tween  two  objects.  For  example,  if  object  i  and  object  j  have  no  relation  at  all,  then  Cij  =  0. 

Equation  (4.2)  answers  the  question:  given  object  i  having  label  A,  how  is  that 
compatible  with  all  the  possible  cases  of  the  other  objects  having  different  labels?  For  exam¬ 
ple,  in  Figure  4.1,  if  we  want  to  calculate  qi{circle),  we  need  to  examine  the  compatibilities 
of 


1.  object  1  is  circle  and  object  2  is  triangle 

2.  object  1  is  circle  and  object  2  is  square 

3.  object  1  is  circle  and  object  3  is  triangle 

4.  object  1  is  circle  and  object  3  is  square 

After  we  calculate  the  four  compatibilities,  we  multiply  them  each  with  its  respective  con¬ 
fidence  and  sum  them  up.  We  observe  that  (4.2)  is  a  sum  of  a  function  multiplied  by  a 
“probability”  and  some  readers  may  find  it  helpful  to  think  of  it  as  an  expected  value. 

The  actual  design  of  the  compatibility  function  r(i,  j.  A,  A')  is  problem-dependent. 
It  should  be  a  function  that  returns  a  higher  value  when  both  labeling  object  i  as  A  and 
labeling  object  j  as  A'  make  sense.  Specifically,  in  the  design  of  r{i,j,  A,  A'),  the  return  value 
is  required  to  satisfy 


-  1  <  r(i,  j,A,  A')  <  1  (4.3) 

Using  our  toy  example  in  Figure  4.1,  we  assume  that  we  somehow  know  that  all 
the  objects  in  the  image  are  drawn  with  a  constant  size,  i.e.  the  radius  of  the  circle  and 
the  sides  of  the  triangle  and  square  are  all  d.  Then  using  our  knowledge  of  geometry,  we 
know  that  the  area  of  a  circle  of  radius  d  is  7rd^,  which  is  larger  than  the  area  of  a  square  of 
size  d.  The  area  of  the  square,  is  again  larger  than  the  area  of  a  regular  triangle  of  size 
d,  which  is  '^d?.  Hence  one  way  of  designing  a  compatibility  function  for  the  problem  in 
Figure  4.1  is  to  compare  the  area  of  object  i  and  object  j.  If  we  label  object  i  as  circle  and 
object  j  as  square,  then  the  area  of  object  i  should  be  larger  than  j.  Similarly,  if  we  label 
object  i  as  square  and  object  j  as  triangle,  then  the  area  of  object  i  should  also  be  larger 
than  object  j.  If  we  find  the  area  of  object  i  to  be  much  bigger  than  the  area  of  object 


42 


j,  then  labeling  i  as  circle  and  labeling  j  as  square  makes  sense,  so  r{i,j,  circle,  square) 
should  return  a  positive  value  close  to  1. 

We  give  a  simple  example  design  of  the  compatibility  function  here.  Suppose  that 
we  use  a  certain  image  processing  algorithm  to  estimate  the  area  of  object  i  in  the  image,  and 
we  denote  it  as  Ai.  Similarly,  the  area  of  another  object  j  is  estimated  to  be  Aj.  Following 
our  previous  assumption  that  the  objects  should  be  of  the  same  side  length,  we  know  that  if 
we  label  object  i  as  circle  while  labeling  object  j  as  square,  then  we  are  expecting  Ai  >  Aj. 
We  also  know  that  given  Ai  >  Aj,  the  number  should  be  between  0  and  1  since  both 

Ai  and  Aj  are  positive  numbers.  To  meet  the  requirement  of  returning  a  value  between  —1 
and  1  when  we  design  a  compatibility  function,  we  can  choose  r{i,  j,  circle,  square)  to  be 
2  —  1  when  we  label  i  as  circle  and  j  as  square.  Similarly,  when  we  expect  Ai  >  Aj^ 

we  can  also  use  this  form.  In  the  cases  where  Ai  <  Aj,  we  can  use  the  form  2 —  1. 
Then  one  possible  design  of  the  compatibility  function  can  be 

r  r{i,  j,  circle,  square)  =  r{i,  j,  circle,  triangle)  =  r{i,  j,  square,  triangle)  =  —  1 

^  r{i,  j ,  square,  circle)  =  r{i,  j,  triangle,  circle)  =  r{i,  j,  triangle,  square)  =  2^1^^  —  1 

(4.4) 

It  is  important  to  understand  the  distinction  between  the  terms  “compatible”  and 
“correct”.  It  is  possible  for  a  labeling  of  node  i  by  A  to  be  completely  compatible  with 
labeling  node  j  by  A'  and  both  be  wrong.  We  hope  to  always  set  up  the  iterative  labeling 
problem  so  that  the  most  globally  compatible  labeling  will  also  be  the  correct  one. 

In  Appendix  C,  we  will  explain  why  relaxation  labeling  is  a  “relaxation”  process 
and  derive  how  to  minimize  its  objective  function. 
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Chapter  5 

A  New  Relaxation  Labeling 
Architecture  for  Secure 
Localization 

5.1  Related  Work 

The  secure  localization  problem  was  first  proposed  by  Lazos  et  al.  [43]  who  for¬ 
mulated  the  problem  in  an  a  distributed  system.  Also,  there  is  no  direct  measurement  of 
range  in  [43].  Instead,  the  distance  between  sensor  nodes  are  calculated  by  counting  how 
many  intermediate  hops  between  nodes  there  are,  the  so-called  “hop  counts”.  For  exam¬ 
ple,  if  a  message  is  sent  by  node  1,  and  it  goes  through  node  2  to  reach  node  3,  then  the 
hop  count  is  two.  As  another  example,  if  a  message  is  sent  by  node  1,  and  goes  through 
nodes  2  through  4  to  reach  node  5,  then  the  hop  count  is  4.  Since  our  localization  algo¬ 
rithms  are  centralized  and  range-based,  the  distributed  methods  are  not  furthered  discussed. 

A  recent  work  was  proposed  by  Liu  et  al.  [46] ,  which  uses  a  thresholding  technique 
to  detect  malicious  nodes.  If  n  sensor  nodes  report  the  same  event,  each  with  a  different 
range  estimate  d*,  f  =  1,  •  •  •  ,  n,  we  can  calculate  the  discrepancy  associated  with  each  node 
as  in  (2.13).  Liu  et  at.  [46]  set  a  threshold  on  the  discrepancy,  and  set  those  nodes  with  dis¬ 
crepancies  higher  than  the  threshold  as  malicious  nodes.  The  performance  of  [46]  decreases 
as  the  number  of  malicious  nodes  increases  in  the  network. 
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Statistical  filtering  methods  based  on  authentication  are  proposed  in  [60,  65,  17] 
to  detect  false  range  reports  made  by  malicious  nodes.  In  [60],  when  an  event  occurs,  those 
nodes  which  have  detected  the  event  collectively  calculate  the  location  of  the  event,  and 
endorse  it  with  a  Message  Authentieation  Code  (MAC).  As  a  report  is  forwarded  through 
multiple  hops  toward  the  central  processing  unit,  each  forwarding  node  verifies  the  correct¬ 
ness  of  the  MAC  with  certain  probability.  If  the  probability  of  some  MAC  being  correct  is 
too  low,  the  report  will  be  dropped.  The  probability  of  detecting  incorrect  MACs  increases 
with  the  number  of  hops  it  travels.  The  success  of  the  algorithm  in  [60]  depends  on  the 
design  of  the  MAC  key  for  other  nodes  to  authenticate.  Our  algorithm  is  not  based  on  key 
authentication,  and  Ye  et  al.  [60]  also  does  not  formulate  the  problem  assuming  the  sensor 
nodes  are  colluding. 

Although  we  assume  that  malicious  nodes  can  successfully  authenticate  with  the 
network,  earlier  works  propose  methods  to  securely  distribute  keys  to  nodes  inside  the 
network.  For  example,  Eschenauer  et  al.  propose  a  key  management  scheme  for  key  distri¬ 
bution,  revocation,  re-eying  and  incremental  additions  of  nodes  [22] . 

Our  work  provides  a  different  perspective  to  the  secure  localization  problem  [9, 
11].  We  propose  a  new  relaxation  labeling  architecture  to  detect  colluding  malicious  nodes 
first  [9].  After  removing  malicious  nodes,  we  can  use  only  data  from  benign  nodes  to  perform 
localization  [11]. 

5.2  The  New  Relaxation  Labeling  Architecture 

We  assume  that  the  sensor  nodes  are  densely  deployed,  i.e.,  for  every  event  there 
will  be  n  nodes  that  have  detected  it,  and  n  >  3.  Besides,  as  illustrated  in  Figure  2.1,  only 
three  nodes  are  needed  to  localize  the  event.  We  will  denote  a  set  of  3  nodes  as  a  triple. 

Since  n  >  3,  we  have  (  )  =  ^i.2^3  triples.  The  estimated  event  locations 

from  each  triple  may  vary,  since  there  may  be  1,  2  or  3  malicious  nodes  in  each  set.  If  a 
triple  has  malicious  nodes  in  it,  the  ranges  reported  by  the  3  nodes  will  usually  be  incon¬ 
sistent  (unless  all  3  are  malicious),  and  such  inconsistency  will  be  shown  in  e*,  i  =  1,2,3. 
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We  would  like  to  exploit  such  inconsistency  and  use  relaxation  labeling  algorithms  to  detect 
malicious  nodes.  In  Chapter  6,  we  will  activate  more  than  one  node  at  each  time  step,  and 
the  relationship  between  those  active  nodes  at  successive  time  steps  can  be  used  to  examine 
inconsistency.  More  details  on  the  tracking  part  will  be  described  in  Chapter  6. 

In  classical  relaxation  labeling  algorithms  [51],  the  compatibility  function  is  de¬ 
fined  over  two  objects.  The  idea  of  extending  the  compatibility  function  to  three  or  more 
objects,  the  so-called  “higher-order”  compatibility  functions,  was  first  conceived  in  [31]. 
However,  very  little  literature  has  really  designed  and  implemented  higher-order  relaxation 
labeling.  The  reason  is  explained  in  [21],  as  Eklundh  et  al.  experimented  with  both  clas¬ 
sical  and  higher-order  relaxation  labeling  algorithms  on  edge  enhancement  applications. 
The  conclusion  in  [21]  is  that  the  performance  improvement  gained  from  using  higher-order 
compatibility  functions  is  not  very  significant.  Hence  the  performance  gain  does  not  justify 
the  increased  computing  cost  of  higher-order  compatibility  functions. 

There  does  exist  evidence  showing  that  higher-order  relaxation  labeling  algorithms, 
when  designed  for  certain  applications,  outperform  classical  ones.  One  example  is  graph 
matching  [25,  41].  The  compatibility  function  in  [25]  is  specified  in  terms  of  the  face-units 
of  the  graphs  under  match,  and  it  requires  only  knowledge  of  the  number  of  nodes,  edges 
and  faces  in  the  model  graph.  The  work  in  [25]  shows  that  higher-order  relaxation  label¬ 
ing  algorithms,  when  designed  well,  can  provide  better  performance.  In  this  dissertation, 
however,  it  is  necessary  to  use  higher-order  compatibility  functions  since  localization  of  an 
event  requires  a  triple  of  nodes. 

We  define  label  A  for  each  sensor  node  as  assigning  it  to  be  malicious  or  benign. 
A  “labeling”  is  the  association  of  a  node,  i,  with  a  label.  A,  and  this  association  is  denoted 
by  the  ordered  pair  (i,A).  Specifically,  if  a  node  has  label  A  =  m,  it  is  considered  to 
be  malicious,  while  having  label  X  =  b  denotes  benign.  For  each  sensor  node,  we  define 
a  confidence  P{X).  The  confidence  of  node  i  having  label  A  is  denoted  as  Pi{X).  The 
confidence  PfiXj)  has  probability-like  properties: 

0<P*(A,)<1  yi,j  ^PfiXj)  =  l.  (5.1) 

j 
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Note  that  in  (5.1),  we  only  have  two  labels,  hence  Xj  G  {m,b}.  Following  [51],  we 
iteratively  update  the  confidence  of  node  i  having  label  j  as 


(5.2) 


where  Dj  =  [l  +  <?i(Afc)] ,  Xk  G  {m,  6}  is  a  normalization  required  to  ensure  that 

Pi{Xj)  sums  to  1,  and  t  stands  for  iteration. 


Since  we  need  at  least  3  nodes  to  perform  the  localization  task,  we  define  q\{X)  to 
be  a  measure  of  how  consistent  the  labeling  of  node  i  as  having  label  A  is  with  the  labeling 
of  both  the  other  nodes: 

<'.‘(a)  =  4eEEp.mE  Pfc(A")r(i,j,fc,A,A',A"),  (5.3) 

j  k  y  A" 

where  N  =  (n  —  l)(n  —  2),  n  is  the  number  of  nodes  in  the  network,  j  =  1, ...,  n,  k  =  1, ...,  n, 
j  i,  k  i,  j  k,  and  r{i,  j,  k,  X,  X' ,  X”)  is  a  “compatibility  function”  to  be  defined  and 
explained  later.  Note  that  the  summation  are  over  all  the  nodes,  not  just  a  single  triple. 
The  power  of  ql{X)  is  that  it  considers  the  consistency  of  node  i  having  label  A  with  all  other 
pairs  of  nodes  j,k.  If  (f,A)  is  not  consistent  with  other  labelings,  qf{X)  will  be  negative, 
hence  driving  Pi{X)  down. 


From  (5.1),  it  is  evident  that  we  only  need  to  calculate  one  probability  {Pi{b)  or 
Pi{m))  for  each  node  since  the  two  probabilities  sum  up  to  1.  That  probability  is  further 
dependent  on  q\{Xj)^  as  shown  in  (5.2).  The  computational  complexity  of  q\{Xj)^  as  shown 
in  (5.3),  is  O(n^),  where  n  is  the  number  of  nodes  in  the  network. 

5.2.1  Design  of  the  Compatibility  Function 

To  implement  (5.3),  we  need  to  design  a  compatibility  function  r{i,  j,  k,  X,  X' ,  X”) 
which  will  be  higher  when  nodes  i,j,k  having  labels  A,  A',  A'',  respectively,  is  consistent, 
and  vice  versa.  Specifically,  —1  <  r(-)  <  1.  For  example,  if  we  have  10  nodes,  and  we  are 
examining  nodes  3,  5  and  7.  If  node  3  is  malicious,  and  nodes  5  and  7  are  benign,  then 
r(3,  5,  7,  m,  b,  b)  should  be  close  to  1. 
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For  each  triple  containing  nodes  {i,j,  k),  we  may  calculate  the  respective  discrep¬ 
ancies  €i,  €j,  and  Cfc  by  equation  (2.13).  Define  e  =  e*  -|-  e^-  -|-  as  the  total  discrepancy. 
Each  node  of  the  triple  could  be  either  benign  or  malicious,  so  we  ought  to  have  2^  =  8 
different  compatibility  functions  for  each  triple 

1.  r{i,j,  k,  b,  b,  b):  node  i  benign,  node  j  benign,  node  k  benign 

2.  r{i,j,  k,  b,  b,  m):  node  i  benign,  node  j  benign,  node  k  malicious 

3.  r{i,j,  k,  b,  m,  b):  node  i  benign,  node  j  malicious,  node  k  benign 

4.  r{i,  j,  k,b,m,m):  node  i  benign,  node  j  malicious,  node  k  malicious 

5.  r{i,j,  k,  m,  b,  b):  node  i  malicious,  node  j  benign,  node  k  benign 

6.  r{i,  j,  k,m,b,m):  node  i  malicious,  node  j  benign,  node  k  malicious 

7.  r{i,  j,  k,m,m,b):  node  i  malicious,  node  j  malicious,  node  k  benign 

8.  r{i,j,k,m,m,m):  node  i  malicious,  node  j  malicious,  node  k  malicious 

Due  to  the  fact  that  the  malicious  nodes  are  colluding,  we  only  need  to  design 
two  types  of  compatibility  functions.  Case  1  and  Case  8  use  one  type  of  the  compatibility 
function.  Case  2,  Case  3,  Case  4,  Case  5  and  Case  6  use  the  other  type  of  the  compatibility 
function.  Let  us  look  at  them  in  detail. 

Case  1  and  Case  8 

We  assume  that  the  malicious  nodes  are  colluding,  i.e.  all  the  malicious  nodes  will 
report  that  a  fictitious  event  occurred  at  a  particular  location  (which  is,  of  course,  different 
from  the  true  event  location).  In  this  regard,  the  case  where  all  three  nodes  are  malicious 

is  as  compatible  as  the  case  where  all  three  nodes  are  benign.  Let  us  consider  the  3-node 

benign  case.  Since  all  of  the  3  nodes  are  benign,  the  total  discrepancy  should  be  small.  So 
what  we  need  is  a  function  with  the  following  properties: 

•  Given  a  small  e,  the  compatibility  function  returns  a  value  close  to  1 

•  Given  a  large  e,  the  compatibility  function  returns  a  value  close  to  —1 
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•  Given  any  intermediate  value  of  e,  the  compatibility  function  returns  a  value  between 
—1  and  1 

•  The  compatibility  function  is  monotonic  and  (preferably)  smooth 
Hence  for  the  3-node  benign  case,  we  can  define 

r{i,  j,  k,  b,b,b)  =  l-  (5.4) 

where  Ti  is  a  threshold  on  the  error  term  e.  In  equation  (5.4),  e  is  the  total  discrepancy 
of  the  triple.  The  larger  e  is,  the  less  likely  that  this  triple  has  all  benign  nodes,  because 
a  large  e  indicates  that  the  nodes  in  the  triple  do  not  agree  with  each  other.  In  equation 
(5.4),  the  function  of  the  form  1/(1  is  generally  referred  to  as  a  sigmoid  function 

[57].  The  advantage  of  using  a  sigmoid  function  is  that  the  compatibility  function  will  be 
a  smooth  curve  between  -1  and  1.  We  show  some  example  parameters  of  equation  (5.4)  in 
Figure  5.1. 
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Figure  5.1:  Some  example  parameter  settings  of  equation  (5.4) 


The  compatibility  function  for  the  3-node  malicious  case  is  identically 
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^  ^  g-ai(e-Ti) 


(5.5) 


since  the  malicious  nodes  are  assumed  to  be  colluding.  Values  for  ai  and  Ti  and  sensitivity 
to  their  choices  will  be  discussed  in  Section  5.4. 


Case  2  to  Case  6 

If  there  is  only  one  malicious  node  in  the  triple,  the  other  two  nodes  must  be 
benign.  This  one-node  malicious  case  is  exactly  the  same  as  the  one-node  benign  case, 
because  in  the  one-node  benign  case,  the  other  two  malicious  nodes  are  colluding. 

Let  us  look  at  the  one-node  malicious  case  hrst.  For  one-node  malicious  case,  the 
single  malicious  node  in  the  triple  reports  an  incorrect  event  location.  Hence  the  local¬ 
ization  result  will  be  shifted  due  to  the  “contamination”  from  the  malicious  node.  As  a 
result,  inconsistency  among  the  three  nodes  exists.  Since  the  localization  result  is  incorrect, 
every  node  in  this  triple  will  have  a  discrepancy,  ei,  I  G  {a,  b,  c}  G  {1, 2,  •  •  •  ,  n}.  Since  the 
malicious  node  is  the  minority  among  the  three,  it  will  tend  to  contribute  a  higher  error. 
We  can  design  our  compatibility  function  based  on  this  concept: 

Given  a  particular  triple,  {i,j,k),  and  considering  the  labelings  {j,b)  and 

{k,  b),  we  then  introduce  a  new  variable  x  which  measures  the  relative  contribution  of  node 
i  to  the  total  discrepancy 

e  Q  +  G  A 

We  also  need  to  check  if  the  total  discrepancy,  e,  of  this  triple  is  small.  In  (5.5),  Ti  was 
used  to  indicate  a  signihcant  inconsistency,  and  we  use  it  again  here  for  the  same  purpose. 
If  e  <  Ti,  then  this  triple  should  be  either  all-benign  or  all-malicious.  Therefore,  we  want 
a  compatibility  function  which  has  the  following  properties 

•  First  check  if  e  is  small.  If  yes,  the  compatibility  function  should  return  a  negative 
value^.  Otherwise,  check  x  as  follows: 

e  is  quite  small  or  quite  large,  we  are  confident  that  our  labeling  is  consistent,  and  r  is  allowed  to 
go  to  ±1.  However,  in  the  1-node  different  case,  we  have  less  confidence  overall.  Fortunately,  relaxation 
labeling  provides  a  mechanism  for  expressing  this  lack  of  confidence;  we  simply  bound  r  to  be  in  the  range 
-0.5  <  r  <  0.5. 


(5.6) 
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—  Given  a  small  x,  the  compatibility  function  should  return  a  negative  value 
—  Given  a  large  x,  the  compatibility  function  should  return  a  positive  value 
—  Given  any  intermediate  value  of  x,  the  compatibility  function  is  a  smooth  curve 

The  compatibility  function  for  one-node  malicious  (or  one-node  benign)  case  is 
then  defined  as 


r{i,j,k,m,b,b)  =  —0.5  if  e  <  Ti  (5.7) 

Otherwise,  if  e  >  Ti,  we  use  the  following  form  for  the  compatibility  function  for  one-node 
malicious  (one-node  benign)  case 


r{i,j,k,m,b,b) 


1 

1  _|_  e-“2(a;-r2) 


-  0.5 


(5.8) 


where  x  =  Cj/e,  the  ratio  of  the  discrepancy  by  the  malicious  node  i  divided  by  the  total 
discrepancy,  e.  In  equation  (5.8),  the  larger  x  is,  the  larger  r  is.  That  is  to  say,  if  we  find 
that  for  a  particular  triple  (i,  j,  k),  the  discrepancy  coming  from  node  i  is  a  large  component 
of  the  total  discrepancy  e,  then  we  have  a  high  confidence  to  claim  that  node  i  is  malicious 
and  nodes  j  and  k  are  benign.  Some  example  parameter  choices  of  (5.8)  are  illustrated  in 
Figure  5.2. 


The  other  one-node  benign/malicious  compatibility  functions  are  defined  in  the 
same  manner  as  (5.8).  We  summarize  all  8  compatibility  functions  in  Table  5.1.  Note  that 
in  Table  5.1,  x,^  =  e^/e,  where  C.  =  i,j,k. 

We  have  observed  that  the  concept  behind  the  compatibility  function  in  (5.4)  al¬ 
most  always  matches  with  real  data  (see  Section  5.4).  In  other  words,  what  we  have  found 
from  the  real  data  is  that  for  all  benign  nodes,  the  total  discrepancy,  e,  is  always  small. 
However,  the  design  concept  of  the  compatibility  function  in  (5.8)  is  less  robust  on  real 
data.  That  is  to  say,  some  small  portion  of  the  triples  having  one  malicious  node  and  two 
benign  nodes,  will  violate  the  rule  of  x  =  ^  <  |,  where  denotes  the  discrepancy  from 
the  malicious  node.  For  this  reason,  we  scale  the  compatibility  functions  to  be  between  -0.5 
and  0.5  in  (5.8).  We  have  discovered  that,  empirically,  this  makes  the  convergence  of  the 
confidences  more  stable. 


Table  5.1:  Compatibility  functions 


r{i,j,  k,  m,  m,  m) 

r{i,  j,  k,  m,  m,  b) 

-0.5  if  e  <  Ti 

1/(1 _  0.5  ife>ri 

r{i,  j,  k,  m,  b,  m) 

-0.5  if  e  <  Ti 

1/(1 -he-“2(^i-r2)) -0.5  ifoTi 

r{i,j,k,m,b,  b) 

-0.5  if  e  <  Ti 

1/(1  -h  _  0.5  if  e  >  Ti 

r{i,  j,  k,  b,  m,  m) 

-0.5  if  e  <  Ti 

1/(1  -h  e-“2(*-T’2))  _  0.5  if  e  >  Ti 

r{i,j,k,b,m,b) 

-0.5  if  e  <  Ti 

1/(1 -he-"2(^i-^2))  _  0.5  ife>ri 

r{i,j,k,b,b,m) 

-0.5  if  e  <  Ti 

1/(1  -h  e-"2(*fc-T’2))  _  0.5  if  e  >  Ti 

r{i,j,k,b,  b,  b) 

l-2/(l-he-“i("-^i)) 
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We  summarize  the  algorithm  of  relaxation  labeling  in  Table  5.2.  The  pseudo  code 
for  calculation  of  qi{Xj)  in  equation  (5.3)  is  listed  in  Table  5.3.  In  our  simulations  in  Section 
5.3.1,  we  specifically  removed  the  exceptional  cases  where  some  sensor  nodes  are  equidistant 
to  both  true  event  and  fictitious  event  locations.  For  example,  we  show  one  sensor  network 
consisting  of  four  nodes  in  Figure  5.3.  In  that  figure,  we  have  two  malicious  nodes  and 
two  benign  nodes.  The  two  benign  nodes  are  equidistant  to  both  the  true  event  and  the 
fictitious  event.  Hence  they  appear  to  be  consistent,  but  the  result  is  incorrect  (all  agreeing 
on  the  fictitious  event  in  Figure  5.3).  For  the  purpose  of  simulations,  we  have  excluded 
such  ill-conditioned  cases  where  sensor  nodes  are  equidistant  to  both  the  true  and  fictitious 
events. 


Table  5.2:  Algorithm  for  relaxation  labeling 

Use  range  estimates  Si  to  calculate  e*,  ej,  ek  for  all  triples; 
Initialize  Pi{m)  ~  Pi{b)  —  0.5; 

Set  K  to  be  a  small  positive  number; 
while  (for  all  nodes  k  <  Pi{m)  <  1  —  k)  { 
for  each  node  { 

Calculate  qi{m)  and  qi{h)  using  Table  5.3; 

Calculate  Di{m)  =  Pi{m){l  -|-  qi{m))  +  Pi{h){l  +  qi{b)); 
Calculate  Pi{m)  =  Pi{m)  x  (1  -|-  qi{m)) / Di] 

Calculate  Pi{b)  =  1  —  Pi{m)] 

}} 
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Table  5.3:  Subroutine  for  calculation  of  qi{Xj) 

Given  node  number  i  and  Ai; 
total  =  0.0; 

n  is  the  total  number  of  nodes; 
for  (k  from  0  to  number  of  nodes)  { 
if  (  k  /  i)  { 

for  (1  from  0  to  number  of  nodes)  { 
if  (  (i  /  1  and  k  /  1))  { 
for  (Ai  from  m  to  6)  { 
tot  =  0.0; 

for  (A2  from  m  to  6)  { 

Calculate  r(i,  k,  1,  Ai,  A2,  A3)  using  Table  5.1; 
tot  +=  r(i,  k,  1,  Ai,  A2,  As)  x  Pi{\‘2)] 

} 

total  +=  tot  X  Pfc(Ai); 

}}}}} 

return  total l{n  —  1) /(n  —  2) ; 


Figure  5.3:  A  sensor  network  consisting  of  four  nodes.  Node  1  and  node  4  are  malicious; 
while  node  2  and  node  3  are  benign.  In  this  example,  node  2  and  node  3  are  equidistant  to 
both  the  true  event  and  the  fictitious  event.  Hence  they  appear  to  be  consistently  reporting 
on  the  fictitious  event.  Ill-conditioned  cases  like  this  have  been  excluded  in  our  simulations. 
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5.3  Experimental  Results  of  Detecting  Malicious  Nodes  Us¬ 
ing  Relaxation  Labeling 


5.3.1  Simulation 

We  randomly  generate  the  positions  of  n  nodes  in  a  test  field  of  [0,10]  x  [0,10]. 
Among  the  n  nodes,  Um  are  malicious.  We  simulate  the  sensor  measurements  Zi  using  the 
sensor  model  in  (2.2)  by  setting  a  =  4 


Zi  =  Wi, 


(5.9) 


where  a  is  the  amplitude  of  the  event,  di  is  the  actual  distance  from  sensor  i  to  the  event, 
and  w  is  additive  Gaussian  noise.  Without  loss  of  generality,  we  set  a  =  1.0,  and  equation 
(5.9)  becomes 


Zi  =  -^  +  Wi 


(5.10) 


For  benign  nodes,  di  is  the  distance  from  node  i  to  {xb,yb),  the  location  of  the 
true  event.  Similarly,  for  malicious  nodes,  di  is  the  distance  from  node  i  to  {xmiVm)-,  the 
fictitious  event  location  that  the  malicious  nodes  report.  To  get  an  estimate  of  how  large 
we  should  choose  the  noise  variance  in  our  simulations,  consider  the  position  of  a  sensor 
node,  (xj,  yi).  Xi  and  yi  are  both  uniform  random  variables,  x*  ~  U  [0, 10]  and  y*  ~  U  [0, 10]. 
Consider  that  the  event  is  positioned  at  (0,0),  hence  df  =  (x?  +  yf).  The  expected  value 
of  X?  is  so  the  expected  value  of  df  is  roughly  Hence  a  typical  value  of  ^  is 
^  =  0.015.  So  noise  variance  =  0.01  is  large  as  compared  to 


To  obtain  range  estimates  from  each  sensor  node,  we  use  (2.4)  to  calculate 


(5.11) 


In  our  first  experiment,  we  set  {xb,  yb)  =  (3.0, 3.0)  and  (xm,  ym)  =  (7.0,  7.0).  All  of 
the  probabilities  are  initialized  at  Pi{\j)  —  0.5.  The  number  of  nodes,  n,  is  7,  and  the  first 
two  nodes  are  always  chosen  to  be  malicious,  i.e.  rim  =  2.  The  parameters  are  determined 
experimentally  (see  Section  5.4)  as  ai  =  4.0,  Ti  =  0.1,  a2  =  2.0,  T2  =  1/3.  Note  that  ai 
and  a2  merely  control  the  slope  of  the  sigmoid  function  in  Equation  (5.4)  and  (5.8),  and 
small  changes  of  ai  and  02  generally  do  not  have  any  influence  on  the  simulation  outcome. 
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We  choose  ai  to  be  larger  than  02  because  for  a  triple  to  be  consisting  of  all  benign  or  all 
malicious  nodes  is  a  more  strict  condition  than  one  node  benign  or  one  node  malicious^. 
T2  is  also  almost  always  set  to  1/3  since  we  assume  that  each  node  in  a  triple  contributes 
equally.  Hence,  the  two  parameters  that  impact  the  performance  are  Ti  and  the  noise 
variance  where  Ti  can  be  controlled  by  the  user  but  ci^  is  determined  by  the  system 
environment. 

To  verify  the  correctness  of  our  relaxation  labeling  algorithm,  we  set  =  1.0  x 
10“®,  essentially  noise  free.  The  relaxation  labeling  process  took  22  iterations  to  converge, 
and  the  time  it  took  was  less  than  1  second  on  a  Dual  1.8GHz  PowerPC  G5  computer.  We 
plot  an  instance  of  the  (random)  locations  of  the  7  nodes  in  Figure  5.4.  Also,  based  on  6 
reported  from  each  sensor,  we  plot  a  circle  for  each  node  in  Figure  5.4.  In  Figure  5.5,  we 
show  the  convergence  of  Pi{b),  i  =  1,...,7.  Since  Po{b)  and  Pi{b)  converged  to  0,  node  0 
and  node  1  are  found  to  be  malicious. 


Figure  5.4:  Simulation  Example  of  7  nodes.  Two  of  the  nodes  are  malicious. 


^Recall  that  for  a  triple  to  be  all  benign  or  all  malicious,  its  e  has  to  be  small.  Choosing  oi  larger  will 
make  the  sigmoid  function  more  like  a  threshold  function 
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Figure  5.5:  Convergence  of  P{b) 

Next,  to  examine  if  the  relaxation  labeling  process  is  correct,  we  repeat  the  same 
experiment  using  cj^  =  1.0  x  10“®  and  Ti  =  0.01  for  10,000  times.  In  all  of  the  10,000  ex¬ 
periments,  the  locations  of  the  nodes  are  different.  The  result  from  the  10,  000  experiments 
are  all  correct. 

Next,  we  look  at  the  impact  of  parameters  on  the  system  performance.  At  a  par¬ 
ticular  Ti  and  noise  variance  cj^,  we  repeat  the  relaxation  labeling  experiment  (2  malicious 
nodes  out  of  7  nodes)  for  100  times,  each  time  the  nodes  are  positioned  randomly.  For  any 
experiment  out  of  100  times,  we  will  consider  the  entire  experiment  a  failure  if  ANY  of  the 
7  nodes  is  misclassified.  Note  that  this  is  a  very  strict  failure  definition.  We  then  perform 
another  100-times  experiment  at  different  Ti  and  noise  variance  ci^  values.  The  failure  rate 
(out  of  100  experiment)  is  shown  in  Figure  5.6  as  a  3D  graph.  Note  that  in  Figure  5.6, 
the  y-axis  is  log-scale  (base  10),  hence  noise  variance  ci^  ranges  from  1.0  to  1.0  x  10“®.  We 
can  see  that  at  1.0  x  10“^  <  <  1.0  x  10“®  and  0.1  <  Ti  <  0.75,  the  error  rates  are 
almost  0.  The  error  percentage  goes  up  as  Ti  goes  up.  As  a'^  goes  up,  the  failure  rate 
goes  up  considerably.  These  two  phenomena  are  expected  because  as  ci^  increases,  more 
and  more  of  the  localizations  done  by  each  triple  would  be  inaccurate.  Hence  the  overall 
system  error  rate  would  go  up.  There  should  a  suitable  range  for  Ti  because,  as  in  (5.4),  it 
controls  threshold  for  determining  a  triple  to  be  all-benign  (all-malicious)  or  not.  If  Ti  is  set 
too  high,  then  no  matter  how  a  triple  behaves,  it  will  always  be  regarded  as  an  all-benign 
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(all-malicious)  triple.  This  will  create  error  for  the  overall  system  performance.  Hence  if  Ti 
is  set  too  high  and  as  it  increases,  the  error  rate  goes  up.  The  number  of  iterations  required 
to  reach  convergence  corresponding  to  the  Ti  and  a‘^  values  in  Figure  5.6  are  illustrated  in 
Figure  5.7.  We  can  see  that  as  increases,  the  number  of  iterations  required  to  converge 
also  increases.  This  is  because  that  higher  leads  to  more  inaccurate  localizations,  and 
there  are  more  inconsistencies  in  the  system.  Increasing  Ti  also  makes  the  required  number 
of  iterations  to  go  up  a  bit,  although  the  trend  is  less  obvious.  If  we  increase  Ti  too  large, 
there  will  be  more  triples  incorrectly  regarded  as  one-benign  (one- malicious).  Such  incon¬ 
sistencies  lead  to  the  slight  increase  in  the  number  of  iterations  in  Figure  5.7.  In  Figure  5.6, 
the  average  time  to  perform  any  100-times  experiment  is  3.489  seconds  on  a  Dual  1.8GHz 
PowerPC  G5  computer. 


Figure  5.6:  The  effect  of  Ti  and  ci^  on  the  system  performance 


Finally,  we  fix  the  number  of  malicious  nodes  at  2,  and  increase  the  number  of 
total  nodes  in  the  network.  We  also  fix  Ti  =  0.5  and  repeat  the  experiments  for  100  times, 
each  time  with  different  random  node  locations.  The  failure  rate  (out  of  100  times)  is  shown 
in  Figure  5.8  at  three  different  noise  levels.  In  Figure  5.8,  we  can  see  that  the  failure  rate 
slightly  goes  down  as  the  number  of  nodes  gradually  increases.  This  agrees  with  intuition 
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Figure  5.7:  The  number  of  iterations  required  to  reach  convergence  corresponding  to  the 
Ti  and  values  in  Figure  5.6 


because  we  are  detecting  the  malicious  nodes.  As  the  number  of  total  nodes  increases,  there 
are  more  consistency  in  the  system  because  the  number  of  benign  nodes  go  up.  Note  that  in 
Figure  5.8  ,  under  the  assumption  of  dense  deployment,  even  if  we  misclassify  a  small  num¬ 
ber  of  benign  nodes,  we  can  still  use  the  remaining  benign  nodes  to  correctly  localize  events. 


5.3.2  Field  Experiment 

In  this  section,  we  use  an  existing  set  of  field  experiment  data  to  test  our  relaxation 
labeling  algorithm  [46] .  The  field  experiment  was  conducted  in  an  outdoor  environment  us¬ 
ing  MICA2  motes  [29]  running  TinyOS  [28].  8  nodes  are  deployed  in  a  40  x  40  feet  field, 
as  illustrated  in  Figure  5.9.  The  node  positioned  at  (5,5)  of  Figure  5.9  does  not  know  its 
position,  hence  the  original  purpose  of  the  experiment  [46]  is  to  use  the  other  nodes  (node 
0,  node  1,  •  •  • ,  node  6)  to  localize  its  position. 

The  node  positioned  at  (5,5)  is  denoted  as  the  “event”  node.  The  reader  might 
find  it  helpful  to  think  of  it  as  the  event.  The  rest  of  the  nodes  are  sensor  nodes  (since  they 
know  their  respective  positions),  and  they  localize  the  event  node  based  on  the  received 
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Figure  5.8:  Failure  rate  for  a  network  of  various  number  of  nodes.  At  each  network 
only  2  nodes  are  malicious. 
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Figure  5.9:  Deployment  of  sensor  nodes  in  the  field  (Unit:  4  feet).  The  sensor  nodes  are 
denoted  as  circles,  while  the  event  node,  positioned  at  (5,5),  is  denoted  as  a  cross. 
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signal  from  it.  This  experiment  is  isomorphic  to  our  problem  formulation  in  Chapter  3 
since  the  event  node  plays  the  role  of  the  event. 

At  each  sensor  node  location,  we  plot  circles  using  the  range  estimates  S,  and  the 
result  is  illustrated  in  Figure  5.10.  We  observe  from  Figure  5.10  that  due  to  noise,  some 
nodes  report  longer  S;  while  some  report  shorter  5.  However,  the  intersections  of  most  of 
the  circles  fall  on  or  near  the  event  node  location,  (5,  5),  in  Figure  5.10. 


Figure  5.10:  For  each  measurement  at  sensor  nodes,  we  plot  a  circle  using  the  range  esti¬ 
mates.  Note  that  the  event  node  is  at  the  center  of  the  field,  (5,5).  The  intersections  of 
most  of  the  circles  are  either  on  or  near  the  event  node  due  to  noise. 


To  create  colluding  attacks,  we  chose  node  0  and  node  1  to  be  malicious  and  replace 
their  range  estimates  with  the  distance  from  each  node  to  (1,1),  plus  some  noise  distur¬ 
bances.  The  noise  variance  chosen  for  the  malicious  node  is  small,  =  0.000001,  since 
these  nodes  are  colluding.  That  is  to  say,  node  0  and  node  1  report  that  the  event  occurs 
at  (1,1).  The  range  estimates  for  nodes  2,  ...,  6  are  chosen  from  the  field  measurements 
we  have.  In  future  work,  malicious  nodes  could  be  modeled  as  having  randomness  in  their 
behavior. 
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Using  our  relaxation  labeling  algorithm,  we  set  the  threshold  in  Equation  (5.4)  to 
1.5.  This  threshold  is  higher  than  the  Ti  in  Section  5.3.1  because  the  range  of  the  sensor 
field  is  40  by  40  feet,  larger  than  our  simulations.  Recall  that  Ti  is  the  threshold  on  the  total 
discrepancy.  As  the  scale  of  the  system  in  this  field  experiment  is  larger,  we  are  expecting  a 
larger  total  discrepancy.  Hence  Ti  is  larger.  All  the  other  parameters  are  unmodified.  The 
probabilities  converged  at  iteration  =  104.  We  show  the  convergence  of  the  probabilities  in 
Figure  5.11,  and  we  successfully  identify  node  0  and  node  1  to  be  malicious  and  nodes  2, 
...,  6  to  be  benign. 


Figure  5.11:  The  probability  of  each  sensor  node  being  benign,  P{b).  Po{b)  and  Pi{b)  go 
down  to  0,  which  means  that  these  two  nodes  cannot  be  benign. 


In  the  field  measurement  data  that  we  obtained,  each  node  has  several  range 
estimates  of  the  event.  We  randomly  pick  two  range  estimates  from  every  node,  so  we  have 
2^  =  128  different  experiments.  Again,  we  pick  the  first  2  nodes  to  be  malicious,  and  we 
replace  their  range  estimates  5  with  the  distance  to  (1,1).  We  successfully  identified  node 
0  and  node  1  to  be  malicious  nodes  in  all  128  experiments. 
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5.3.3  Comparison  with  an  Existing  Algorithm 

Liu  et  al.  propose  a  voting-based  secure  localization  algorithm  [46].  We  illustrate 
the  voting  algorithm  in  Figure  5.12.  The  sensor  field  is  divided  into  different  non-overlapping 
cells,  and  each  cell  has  a  counter.  At  the  beginning,  all  counters  are  reset  to  zero.  At  each 
node  location,  we  plot  a  circle  using  the  range  estimate.  Si,  as  the  radius.  Those  cells  on 
the  circle  are  incremented  by  one.  Note  that  Liu  et  al.  also  consider  a  measurement  error 
term,  ip,  such  that  the  circles  are  actually  rings  [46].  After  all  circles  have  been  drawn,  we 
pick  the  cell  having  the  highest  count  as  the  event  location.  The  choice  of  cell  sizes  and 
other  statistical  measures  to  make  the  voting  algorithm  more  robust  are  also  discussed  in 
[46].  Note  that  this  is  also  a  Hough-transform  [19,  30,  3]  like  approach,  where  consistent 
solutions  get  high  increment  values. 


Figure  5.12:  Illustration  of  the  voting  algorithm  (Reproduced  from  [46]). 

For  our  relaxation  labeling  algorithm,  we  identify  the  malicious  nodes  first,  and 
remove  them  from  the  network.  The  localization  is  then  done  by  collecting  the  Si  from  the 
benign  nodes  and  using  (2.11). 


Here  we  are  demonstrating  that  the  voting  algorithm  produces  more  incorrect 
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location  estimates  as  the  number  of  malicious  nodes  increases.  For  the  voting  algorithm,  we 
choose  the  cell  size  to  be  1.0  x  1.0,  and  we  denote  the  center  of  the  cell  as  the  event  location, 
according  to  [46].  We  repeat  the  localization  experiments  for  100  times,  each  time  using  a 
different  (and  random)  set  of  20  node  locations.  The  estimated  event  locations  using  each 
algorithm  are  shown  in  Figure  5.13  under  various  number  of  malicious  nodes.  In  Figure 
5.13(a),  there  is  only  one  malicious  node  present.  The  number  of  malicious  nodes  increase 
from  seven  to  nine  in  Figure  5.13(b),  Figure  5.13(c)  and  Figure  5.13(d).  The  relaxation 
labeling  algorithms  can  correctly  localize  the  event  location,  (3.0, 3.0),  with  a  high  number 
of  malicious  nodes.  Hence  the  estimated  event  locations  (marked  as  circles)  in  Figure  5.13 
are  fairly  close  to  each  other  and  to  the  true  event  location.  However,  as  the  number  of 
malicious  nodes  increases,  a  higher  percentage  of  estimates  using  voting  algorithm  (marked 
as  triangles)  is  close  to  the  fictitious  event  location.  The  choice  of  cell  size  of  1.0  x  1.0  is  not 
related  to  this  trend,  since  we  have  observed  the  same  phenomenon  at  cell  size  of  0.5  x  0.5. 
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Figure  5.13:  Comparison  of  the  performance  of  the  relaxation  labeling  algorithm  and  the 
voting  algorithm  in  a  20-node  network.  In  (a),  only  one  node  is  malicious.  In  (b),  (c)  and 
(d),  the  number  of  malicious  nodes  increase  from  seven  to  nine.  Note  that  both  algorithms 
perform  equally  well  when  there  are  2-6  malicious  nodes.  Since  the  experiments  are  re¬ 
peated  for  100  times,  each  with  a  different  set  of  random  node  locations,  overlapping  results 
are  marked  at  the  same  spots  in  (a)  -  (d).  The  voting  algorithm  occasionally  concludes 
that  the  event  is  actually  at  the  fictitious  event  location,  (7.0,  7.0).  However,  our  algorithm, 
relaxation  labeling,  correctly  localize  the  event  at  (3.0,  3.0)  in  every  case. 
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5.4  Choice  of  Parameters 

In  Figure  5.6,  we  can  see  that  the  system  performance  is  effected  by  two  factors: 
the  noise  in  the  measurement  process,  which  is  not  controlled  by  the  user,  and  the  parame¬ 
ters  Ti  and  T2  that  can  be  controlled  by  the  user.  In  this  section,  we  would  like  to  develop 
ways  to  choose  useful  values  of  parameters. 


5.4.1  The  Sigmoid  Function 

Before  we  can  do  that,  we  would  like  to  remind  the  reader  of  the  sigmoid  function 
in  (5.4).  Given  a  set  of  {i,  j,  k,b,b,b),  (5.4)  is  essentially 

2 


r(e)  =  1  — 


which  is  in  the  form  of 


r  : 


1  -h 


ol  ,  udI 


(5.12) 


We  know  from  (2.13)  that  e  is  the  sum  of  three  discrepancies,  e;,  I  G  {a,  b,  c}  G  {1,  2,  •  •  •  ,  n}, 
within  a  triple,  hence  e  >  0.  So  the  Domain  of  (5.12)  is 


{e  G  M^|e  >  0} 


The  Range  of  (5.12)  is 

{r(e)  G  M^|r(e)  G  [—1, 1]} 

since 


limr  =  —1 

e— >00 

lim  r  =  1 

e— >— 00 

The  sigmoid  function  in  (5.4)  is  also  a  solution  to  the  differential  equation 


(5.13) 
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To  see  this,  we  take  derivative  of  r(i,  j,  k,  b,  b,  b)  with  respect  to  e,  and  we  get 

2  1  —  r 

1  —  r  1  —  r 

=  2a, (—)(—-!) 

T  —  r,  ,1  +  r, 

=  -2a.(^){^) 

=  Y(r"  -  1)  (5.14) 

The  parameter  Ti  is  the  inflection  point  of  the  function  r(e).  That  is  to  say,  ^  =  0  at 
e  =  Ti.  To  see  this,  we  again  take  the  derivative  of  (5.14)  and  set  it  equal  to  0,  and  we  have 


d^r 

de^ 


ai  dr 


CXl 


CKl 


(2r)^(r^-l) 


0^1  /  2  -I  \ 

Y^{r  -  1) 


(5.15) 


Therefore,  we  have  r  =  0,  r  =  — 1  or  r  =  1  as  the  solution  to  (5.15).  When  r  =  —1,  it 
is  clear  that  r  =  —  1  is  the  asymptotic  limit  of  e  ^  oo,  hence  r  =  —  1  is  merely  a  trivial 
solution  (|^  =  0).  Similarly,  r  =  1  is  another  trivial  solution  to  (5.15).  The  interesting 
solution  is  when  r  =  0,  we  have 

2 

r(i,  ?', /c,  6, 6, 6)  =  1 - ^  =  0 

2 

l  +  e-ai(e-Ti)  “  ^ 

l  +  g-aih-T)  ^  2 

g-ai(e-Ti)  _ 

ai(e  — Ti)  =  0  (5.16) 


Hence  e  =  Ti.  The  property  that  e  =  Ti  is  an  inflection  point  will  be  used  in  the  following 
sections. 


Analysis  of  the  sigmoid  function  in  (5.8)  can  be  derived  in  a  similar  manner. 
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Recall  from  equations  (5.4)  and  (5.8),  four  parameters  are  of  interest 

1.  Ti:  controls  the  shift  in  (5.4).  See  Figure  5.1 

2.  ai:  controls  the  slope  in  (5.4).  See  Figure  5.1 

3.  T2:  controls  the  shift  in  (5.8).  See  Figure  5.2 

4.  a^'-  controls  the  slope  in  (5.8).  See  Figure  5.2 

We  now  examine  each  of  them  as  follows. 

5.4.2  Choice  of  Ti 

We  would  like  to  choose  suitable  values  of  Ti  so  that  the  system  error  rate  in  terms 
of  misclassifying  nodes  into  wrong  labels  would  be  minimal.  However,  it  is  difficult  to  write 
any  analytic  equation  that  directly  relates  the  system  error  rate  to  Ti. 

One  alternative  is  to  think  of  qi{\)  as  the  objective  function,  and  relate  qi{\)  to  Ti. 
We  call  this  approach  the  optimization  perspective.  Another  alternative  is  to  simply  learn 
from  the  experimental  data.  Our  design  objective  in  (5.4)  is  to  choose  Ti  as  a  threshold 
for  the  total  discrepancy  e.  By  modeling  the  experimental  data,  we  can  choose  a  suitable 
Ti  that  matches  our  design  objective  in  (5.4).  We  call  this  approach  data  dependency 
perspective. 

Optimization  perspective 

In  relaxation  labeling,  our  objective  is  to  calculate  F’j(A),  the  probability  of  object 
i  having  label  A.  What  drives  F’j(A)  is  qi{\).  Hence  we  can  begin  the  optimization  approach 
by  focusing  on  qi{\).  We  denote  the  correct  value  for  qi{X)  to  be  Gi{\).  When  node  i  is 
indeed  of  type  A,  Gi{X)  is  1.0.  In  other  words,  qi{\)  should  be  1.0  when  node  i  actually  has 
label  A.  On  the  other  hand,  Gi(A')  =  —1.0  V  A'  /  A  because  we  are  labeling  node  i  with 
a  wrong  label  X' . 

Since  we  expect  ®(A)  to  be  G'i(A),  we  can  design  the  following  objective  function 

i  A 


(5.17) 
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where  i  =  1, . . . ,  n  are  all  nodes  in  the  sensor  network,  and  A  G  {m,b}.  We  will  take  deriva¬ 
tive  of  this  objective  function,  and  set  it  to  zero. 


Using  chain  rule,  we  have 
d 


EEteW-CiWt  =  (5-18) 


i  X 


i  A 


dTi 


EE  2  fe(A)  -  G,(A)] 


i  X 


%(A) 

dTi 


(5.19) 


From  the  definition  of  (?i(A),  we  can  expand  the  term  in  (5.19)  as 

QiW  =  vEEEw')E  Pk{X")r{i,j,k,X,X',X'‘ 


j  k  X' 


X" 


%(A) 

dTi 


j  k 


dTi 


(5.20) 

(5.21) 


Note  that  now  we  are  treating  Ti  as  a  parameter  of  r.  That  is  to  say,  (5.4)  is  treated  as 
r(ri)  by  holding  the  e  in  (5.4)  as  a  constant.  Take  derivative  of  r{i,j,  k,  A,  A,  A)  with  respect 
to  Ti,  we  get 

dr{i,j,k,X,X,X) 
dTi 


2(1  -h 

(5.22) 

(5.23) 

,1  —  ,  1  —  r , 

2a.{  2  )(1-  2  > 

(5.24) 

,  1  —  r , ,  1  -|-  r , 

2«.(  2X2' 

(5.25) 

(5.26) 

This  arrives  at  a  similar  differential  equation  as  (5.14). 


Substituting  (5.26)  into  (5.21),  we  get 
dqi{X) 


dTi 


j  k 


dTi 


1 

N 


EE  ^AA)^fc(A)y(l-r2(i,j,fe,A,A,A)) 

j  k 


(5.27) 
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where  j  =  1,  •  •  •  ,  n,  A:  =  1,  •  •  •  ,n,  j  ^  i,  k  ^  i. 
Hence  (5.18)  becomes 


dTi  G'j(A)]  — 

E^Ea  -G'i(A)]  jj  \j2jJ2kPjWPk{X)ai{'^-r‘^{i,j,k,X,\,X)) 


Setting  (5.28)  to  0,  we  have 


(5.28) 


E  -  C'* W)  E  E  Pj{X)Pk{X){l  -  r\i,j,  k,  X,  A,  A))  =  0  (5.29) 

i  X  j  k 


The  next  step  is  to  relate  (5.29)  to  Ti.  We  can  achieve  this  by  using  the  Taylor 

series 

e*  =  1  +  X  +  +  •  •  •  ~  1  +  X  (5.30) 

2  6 

and  we  have 


r 


2 

1  + 

2 

1  +  1  —  Q;i(e  —  Ti) 


2  -  ai(e  -  Ti) 


Using  (5.33),  we  can  have 


(5.31) 

(5.32) 

(5.33) 


1  —  =  (1  —  r)(l  +  r) 

2  2 
^2-ai(e-ri)^^^"  2-ai(e-Ti)^ 

_  4  4 

2-ai(e-Ti)  "  [2  -  ^^(e  -  ri)]^ 

r  2-«i(e-ri) _ 1 

[(2-ai(e-Ti))2  (2  -  ai(e  -  ri))2 

4(l-ai(e-ri)) 


(5.34) 

(5.35) 

(5.36) 

(5.37) 


(2-ai(e-ri))2 


(5.38) 
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Substituting  (5.38)  into  (5.29),  we  have 

5:(„(a)  -g.w)y,y. 

1  X  J  k 

One  condition  that  (5.39)  will  hold  is  qi{X}  =  Gj(A)  V  i,X.  However,  it  is 
impossible  to  derive  an  analytic  solution  of  Ti  from  this  condition.  The  other  condition 
that  (5.39)  will  hold  is  4(1  —  Q;i(e  —  Ti))  =  0,  that  is 

fi  =  e  -  —  (5.40) 

ai 

The  analytic  solution  in  (5.40)  is  still  difficult  to  be  applied  to  real  conditions  since 
e  is  a  variable  that  is  different  for  every  triple  in  the  sensor  network.  To  obtain  a  useful 
solution  for  choosing  Ti,  we  take  the  expected  value  of  (5.40) 

Ti  =  E[e]-—  (5.41) 

CHI 

~  E[€]  (5.42) 

where  E[u\  is  the  expected  value  of  u  and  we  have  assumed  that  in  practical  cases,  ai  can 
be  chosen  to  be  large  enough  that  ^  is  negligible  as  compared  to  e.  Hence  we  arrive  at  the 
analytic  solution  that  Ti  can  be  chosen  by  E[€].  The  probability  density  function  of  e  can 
be  approximated  by  a  histogram  (see  the  following  section  for  examples)  and  hence  obtain 
the  expected  value  of  e. 

Data  dependency  perspective 

Based  on  our  analysis  in  the  previous  two  sections,  we  know  that  Ti  controls  the 
location  where  r(e)  =  0.  When  e  =  Ti,  it  is  also  the  location  of  an  inflection  point.  Hence 
Ti  lies  in  the  domain  of  r(e),  not  in  the  range  of  r(e).  To  choose  a  suitable  value  for  Ti,  we 
need  to  model  e  well. 

In  other  words,  the  suitable  choice  of  Ti  depends  on  e,  the  total  discrepancy  of  a 
triple.  To  choose  Ti  is  a  data-dependent  issue. 

From  (2.13),  we  know  that  e  is  the  sum  of  three  respective  discrepancies  in  a  triple. 
There  are  two  terms  in  (2.13),  namely  6i  and  (xq  —  x;)^  +  (yo  ~  yz)^-  is  the  reported 
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range  from  the  node  to  the  event,  hence  it  is  proportional  to  the  scale  of  our  coordinate 
system.  ^ {xq  —  xiY  +  (yo  ~  ViY  is  the  distance  from  the  node  to  the  estimated  event  lo¬ 
cation,  hence  it  is  also  proportional  to  the  scale  of  our  coordinate  system.  Therefore,  as 
the  scale  of  our  coordinate  system  gets  larger,  Ti  needs  to  be  larger  (and  vice  versa)  since 
it  is  the  inflection  point  of  e.  For  example,  in  our  simulations,  the  scale  of  our  coordinate 
system  is  chosen  to  be  [0, 10]  x  [0, 10].  If  we  set  it  instead  to  be  [0, 100]  x  [0, 100],  then  e 
will  also  be  scaled  by  10  times,  and  Ti  will  need  to  be  similarly  scaled. 

The  other  factor  of  importance  to  Ti  is  the  measurement  noise,  Wi,  in  (2.2).  e 
measures  the  difference  between  the  reported  range,  5i,  and  the  calculated  distance  from 
the  node  to  the  event,  (xq  —  x;)^  -|-  {yo  —  yiY-  Regardless  of  whether  the  node  is  malicious 
or  benign,  noise  contributes  to  part  of  e.  Hence  the  suitable  choice  of  Ti  should  also  take 
noise  into  account. 

How  do  we  model  Ti  based  on  the  real  data?  In  most  situations,  prior  calibrations 
can  be  used  to  find  suitable  parameters  for  the  algorithm.  As  an  example  calibration,  we 
simulate  a  seven-node  system  with  two  malicious  nodes  whose  experimental  settings  are  the 
same  as  those  in  Section  5.3.1.  After  repeating  the  simulation  for  100  times,  we  show  the 
histogram  of  all  e  in  Figure  5.14.  We  denote  the  e  for  triples  which  are  all  benign  or  all 
malicious  as  ea-  Similarly,  we  denote  the  e  for  one  benign  or  one  malicious  triples  as  e^,.  In 
Figure  5.14,  the  histogram  of  ea  is  shown  in  red,  while  part  of  the  histogram  of  ej,  is  shown 
in  blue.  The  histogram  of  €b  larger  than  1  is  not  shown  because  the  maximal  €b  has  a  much 
larger  scale  than  any  of  Ca- 

We  can  see  from  Figure  5.14  that  most  of  the  e  from  all-benign  (or  all-malicious) 
triples  are  smaller  than  the  e  of  one-benign  (or  one-malicious)  triples.  This  is  reasonable 
since  there  is  some  disagreement  among  nodes  for  one-benign  or  one-malicious  triples. 

In  a  later  section,  we  will  need  the  largest  values  of  Ca,  therefore,  for  notational 
clarity,  we  denote  €a  in  descending  order  as 

Cal  >  ea2  >  •  •  •  >  CaeOGO  (5.43) 


since  there  are  6000  samples  from  a  seven-node  simulation  which  is  repeated  for  100  times. 
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I  One  benign/malicous  triples 
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Figure  5.14:  Histogram  of  e  for  all  triples  in  a  seven-node  simulation  environment.  Scale  is 
[0, 10]  X  [0, 10],  noise  variance  =  10  The  histogram  of  €a  is  shown  in  red,  while  part 
of  the  histogram  of  is  shown  in  blue.  Note  that  the  histogram  of  €b  larger  than  1  is  not 
shown  because  the  maximal  Cf,  has  a  much  larger  scale  than  any  of  e^. 


We  show  the  histogram  of  {cai,  •  •  •  ,  eaeooo}  at  =  10  ®  in  Figure  5.15. 

Similarly,  histogram  of  {cai,  •  •  •  ,  eaeooo}  at  =  10“^  in  Figure  5.16.  Histogram 
of  {cai,  •  •  •  ,  eaeooo}  at  =  10“^  is  shown  in  Figure  5.17. 
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Histogram  of  a,  =  10'® 
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Figure  5.15:  Histogram  of  {eai,  •  •  •  ,  eaeoooji  =  10  ®.  The  maximal  value  on  the  horizontal 
axis  is  chosen  relative  to  the  maximal  value  of  €a  for  clearer  presentation. 
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Figure  5.16:  {eai,-''  jCaeooo})  =  10  The  maximal  value  on  the  horizontal  axis  is 
chosen  relative  to  the  maximal  value  of  for  clearer  presentation. 
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Histogram  of  cr^  =  10"* 


Figure  5.17:  {eai,-''  jCaeooo})  =  10  The  maximal  value  on  the  horizontal  axis  is 
chosen  relative  to  the  maximal  value  of  €a  for  clearer  presentation. 

We  have  observed  that  as  the  noise  variance  of  our  system  increases,  {cai,  •  •  •  ,  eaeooo} 
also  increases.  First,  we  plot  versus  noise  variance  in  Figure  6.4(a).  We  choose 

the  ten  largest  to  represent  the  maximal  values  of  e^.  Next,  we  plot 
mean  of  Ca,  versus  noise  variance  cr^  in  Figure  6.4(b). 

In  order  to  construct  a  parametric  form  of  e,  we  observe  that  the  histogram  of 
is  close  to  the  pdf  of  an  exponential  random  variable  in  Figure  5.15,  Figure  5.16  and  Figure 
5.17.  The  probability  density  function  of  an  exponential  random  variable  is 

P{x)  =  Ke-^^  (5.44) 

where  k  here  is  a  free  parameter  pertaining  to  the  exponential  random  variable.  We  show 
some  example  PDFs  of  an  exponential  random  variable  in  Figure  5.19. 


To  derive  Ti  as  a  function  of  e^, 
exponential  pdf  is 


consider  that  the  mean  (expected  value)  of  an 
1 


K 


(5.45) 
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(^)  ^ai  versus  noise  variance  (b)  5I]i=i°  ^ai  versus  noise  variance  cr^ 

Figure  5.18:  (a)  The  maximum  (as  represented  by  Cai)  of  €a  versus  noise  variance 

(b)  The  mean  (as  represented  by  ^ai)  of  €a  versus  noise  variance 


X 


Figure  5.19:  Some  example  PDFs  of  an  exponential  random  variable. 
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The  moment-generating  function  of  an  exponential  pdf  is 


9{t) 


e^^Ke-^^dx 


10 


r*00 


K  I 

Jo 

fO 

K  / 

J  —  oo 

|0 


p(k  — t)x 


K- 


K  —  t 
K 


K  —  t 


(5.46) 


If  we  model  the  histogram  of  €«  as  an  exponential  pdf,  we  can  use  the  Chernoff 

bound 


P[X  >  a] 


< 


Ke 


—at 


K  —  t 


(5.47) 


as  a  tool  to  choose  a  suitable  Ti.  In  (5.47),  the  Chernoff  bound  means  that  the  probability 
for  an  exponential  random  variable  X  to  be  larger  than  a  certain  value  a  is  less  than  or 
equal  to  its  moment-generating  function  9{t)  multiplied  by  [55].  Also,  in  (5.47),  the 
moment-generating  function  is  a  function  of  a  variable,  t.  To  have  the  tightest  bound,  take 


=  0 

di  K  —  t 

—aKe~°‘^{K  —  t)  —  Ke““*(— 1) 

{k  —  tY 

Ke~°‘^  =  aKe““*(K  —  t) 

t  =  K -  (5.48) 

a 

Substituting  the  t  in  (5.47)  with  t  =  k  —  the  tightest  Chernoff  bound  is 


P[X  >  a]  < 


(5.49) 


Comparing  with  (5.47),  (5.49)  is  the  tightest  Chernoff  bound  because  t  =  k  —  ^  is  the  place 
where  is  minimized. 
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Example 

When  =  10“®,  we  can  calculate  the  k  =  377.1124  by  simply  calculating  the 
inverse  of  the  mean  of  the  Cai ,  • ' '  >  ^aeooo  ■  Then  we  have 

•  Let  Ti  =  0.01,  P[X  >  Ti]  <  0.2360 

•  Let  Ti  =  0.02,  P[X  >  Ti]  <  0.0109 

•  Let  Ti  =  0.03,  P[X  >  Ti]  <  3.75  x  10“^ 

•  Let  Ti  =  0.04,  P[X  >  Ti]  <  1.15  x  10“^ 

•  Let  Ti  =  0.05,  P[X  >  Ti]  <  3.31  x  10“^ 

This  means  that  at  the  particular  scale  and  noise  variance  =  10“®),  if  we  choose 
Ti  =  0.01,  it  is  not  a  good  upper  bound  because  there  is  a  probability  of  0.236  that  some  Ca 
will  be  greater  than  Ti.  However,  at  the  same  scale  and  noise  variance,  Ti  =  0.05  would  be 
a  suitable  choice  because  the  probability  of  Cq  >  Ti  is  fairly  small  (~  3.31  x  10“^).  However, 
it  does  not  imply  that  choosing  a  large  Ti  will  always  be  good,  as  illustrated  in  Figure  5.6. 
We  have  to  choose  a  Ti  that  is  large  enough  for  P[X  >  Ti]  to  be  small,  but  not  too  large  to 
raise  the  overall  system  error  rate  higher.  We  could  choose  a  reasonably  small  P[X  >  Ti] 
so  that  we  are  confident  in  saying  that  Ti  is  almost  at  the  top  of  all  e^.  In  this  example,  we 
choose  P[X  >  Ti]  <  1.0  x  10“^,  and  calculate  the  corresponding  Ti  =  0.0262  using  (5.49). 
We  present  an  algorithm  for  choosing  Ti  in  Table  5.4. 


Table  5.4:  An  algorithm  to  choose  Ti 

1.  Simulate  a  sensor  network  localization  process,  including  malicious  nodes 

2.  Collect  all  e  for  all  triples 

3.  Calculate  k  in  (5.49)  by  inverting  the  mean  of  the  €«:  k  =  i  L — 

where  Na  is  the  number  of  available 

4.  Set  a  reasonably  small  probability  of  P[X  >  Ti].  For  example,  P[X  >  Ti]  <  1.0  x  10“^ 

5.  Calculate  the  Ti  corresponding  to  P[X  >  Ti]  <  1.0  x  10“^  using  (5.49) 
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5.4.3  Choice  of  T2 

Similar  to  the  role  Ti  plays,  T2  controls  the  place  where  the  sigmoid  function  in 
(5.8)  equals  0.  Hence  we  can  choose  T2  as  a  function  of  x  in  (5.8).  However,  note  that  x  in 
(5.8)  measures  the  relative  contribution  from  this  particular  node  i  to  the  total  discrepancy. 
Since  there  are  three  nodes  in  a  triple,  and  there  is  no  reason  for  us  to  assume  that  one 
node  is  more  important  to  another,  setting  T2  to  ^  makes  the  most  sense,  because  choosing 
any  value  other  than  T2  =  |  is  equal  to  saying  that  node  i  (which  is  malicious)  is  more 
important  than  nodes  j  or  k. 


5.4.4  Choise  of  a  I  and  0:2 

We  perform  secure  localization  in  a  seven-node  system  with  two  malicious  nodes 
using  the  following  parameters 

1.  cj2  =  1.0  X  10“® 

2.  Ti  =  0.5 

3.  r2  =  i 

Q?!  and  a2  merely  control  the  slope  of  the  sigmoid  functions,  and  have  little  impact  on  the 
performance  of  the  relaxation  labeling  algorithm.  This  can  be  demonstrated  by  the  failure 
rate  of  detecting  malicious  nodes.  We  adjust  different  values  of  ai,  and  the  result  demon¬ 
strates  that  with  a  suitable  choice  of  Ti  and  T2,  the  failure  rate  is  consistently  0  at  low 
noise  level.  The  same  outcome  is  obtained  for  a2-  (The  graphs  are  not  shown  since  they 
are  both  0). 

For  completeness,  we  report  the  effect  of  ai  and  02  at  higher  noise  variances. 
Again,  we  perform  secure  localization  in  a  seven-node  system  with  two  malicious  nodes 
using  the  following  parameters 

1.  cj2  =  1.0  X  10“^ 

2.  Ti  =  0.5 

3.  r2  =  i 
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Figure  5.20  shows  the  system  error  rate  with  respect  to  different  values  of  ai  and  a2-  As 
we  have  expected,  ai  and  a2  do  not  effect  the  system  performance. 
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Figure  5.20:  Effect  of  ai  and  a2  on  the  system  performance.  We  choose  a  higher  noise 
variance,  ci^  =  1.0  x  10“^  in  this  figure.  There  is  little  or  no  effect  on  the  system  performance 
while  the  values  of  a\  and  02  are  being  changed. 


5.5  Discussions  on  the  Speed  of  Convergence 

In  this  section,  we  aim  to  address  the  following  question.  How  is  the  speed  of 
convergence  of  a  relaxation  labeling  process  affected  by  the  total  number  of  nodes  in  the 
network? 


One  way  to  address  this  issue  is  from  Appendix  C  of  this  dissertation,  which  for¬ 
mulates  relaxation  labeling  as  a  gradient  descent  method.  We  briefly  review  the  related 
parts  of  Appendix  C  here. 

Classical  relaxation  labeling  updates  the  probability  as  follows 

‘  '  ■’'  Et  n'Oi)  [1  +  <i‘(A*)l 

We  define  Qi{X)  =  [l  +  and  (5.50)  becomes 


(5.50) 
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(5.51) 


EkPn>^k)Ql{Xk) 

The  relaxation  labeling  process  in  (5.51)  can  be  expressed  as  a  gradient  method 


P‘*'{X)  =  P‘{X)  +  4,s’(\) 


(5.52) 


where 


and  (j)  =  1. 


slW  = 


^/(A)  m>^)-EkPH^k)Qi{Xk)] 
EkPn^k)Qi{Xk) 


(5.53) 


The  term  s((A)  in  (5.52)  is  indeed  the  rate  of  change  for  T)(A) 


s‘(A)  =  P‘+HX)  -  PHX)  = 


dPiW 

dt 


(5.54) 


Therefore,  if  Si(A)  is  larger,  T’j(A)  will  converge  faster  (and  vice  versa) 


5.5.1  Speed  of  Convergence 

We  will  define  the  number  of  nodes  in  the  network  as  n,  and  the  number  of  ma¬ 
licious  nodes  in  the  network  as  rim-  Hence  the  ratio  of  malicious  nodes  in  the  network  is 
rim/n.  In  this  section,  we  relate  s*(A),  the  term  that  controls  the  speed  of  convergence,  to  n. 

In  this  dissertation,  we  have  only  two  labels,  m  and  b.  Without  loss  of  generality, 
we  look  at  Si{b)  first.  Equation  (5.53)  becomes 


pm  [Qlib)  -  Pimm  -  P!{m)Qj{m)] 

P!ib)Ql{b)  +  Pl{m)Ql{m) 

p^mm  -  pm  [pmQm + pm)Qim] 

P*{b)Ql{b)  +  Pl{m)Qj{m) 

Pimm  _ 
pimm+piimiim)  ^ 

_ i _ ppb) 

PHPQliP) 


(5.55) 

(5.56) 

(5.57) 

(5.58) 
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In  (5.58),  we  need  to  identify  those  terms  related  to  n.  The  terms  and  P^b) 

change  with  respect  to  time  steps,  hence  we  assume  them  to  be  constants,  not  functions  of 
n.  Rather,  Q\{'m)  and  Q\{b)  will  be  changed  once  n  changes. 

5.5.2  Speed  of  Convergence  vs.  the  Total  Number  of  Nodes 

In  order  to  analyze  s\{b)  (or  s\{m),  for  that  matter)  as  n  changes,  we  consider  a 
specific  case  of  the  sensor  network.  Consider  a  sensor  network  of  n  nodes,  and  each  node 
is  labeled  as  1, 2, . . . ,  n.  Node  1  is  a  malicious  node,  while  nodes  2, . . . ,  n  are  all  benign  nodes. 

Now  we  add  an  extra  benign  node  into  the  network.  That  is,  node  1  is  malicious, 
nodes  2, . . . ,  n  + 1  are  all  benign.  We  denote  the  Q\{h)  when  the  network  size  is  n  as  Q\{b)  . 

n 

Consider  node  1  (which  is  malicious)  first.  The  rate  of  convergence  for  Pi{b)  is 
controlled  by  s\{b),  which  is  controlled  by  Since  we  know  in  advance  that  node  1  is 

malicious,  we  know  that,  as  Pl{b)  converges,  R((6)  <  Pl{m).  Hence  we  have 

Q\{b)  <  Q\{m)  (5.59) 

n  n 

By  definition,  we  have  the  following  equation 

Q{{m)  =  1+  ,  ^  Un  -  l){n  -  2){Q\{m)  -1) 

n+1  TI[TI  —  Ij  ^  ^ 

+  E  E  E  ^n+i(A")r(l,  n  +  1,  m.  A',  A") 

i  A'  A" 

+  '^'^Pn+i{>^)'^Pj{)^')r{l,n+l,i,m,>!'  (5.60) 

i  A'  A" 

In  (5.60),  we  know  that  node  1  is  malicious,  and  the  rest  of  the  nodes  are  benign. 
Hence  as  the  system  approaches  convergence,  only  Pi{m)  and  Pi{b),  i  =  2, . . .  ,n  +  1  are 
closer  to  1.  Other  probabilities  for  other  possible  labels  are  small,  or  even  close  to  0.  For  the 
purpose  of  analysis,  we  will  omit  those  terms  (this  is,  in  spirit,  similar  to  Taylor  series  where 
we  drop  higher-order  terms  which  are  much  smaller).  Hence  (5.60)  can  be  approximated  as 
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Q\{m) 


—  In - 

n+i  n(n  —  1) 


\{n-l){n-2){Q{{m)  -  1) 


+  Pj{b)Pn+i{b)r{l,i,  n  +  1,  m,  b,  b) 
i 

+  Pn+i{h)Pj(h)r{l,n  +  l,z,m,  6,  6)| 


(5.61) 


The  terms  r(l,i,n  +  l,m,b,b)  in  (5.61)  are  positive  numbers  Vi  /  l,(n  +  1). 


n+1 


aQ\{m) 


Equation  (5.61)  can  be  simplified  in  the  form  of  Q\{m) 
b  are  both  positive  numbers.  Hence  we  have 


Q\{m)\  >Q{{m) 

In+l 

In  a  similar  manner,  we  can  have 


Q{{b)  <  Q{{b) 

n+1 


Moreover,  the  fact  the  node  1  is  malicious  still  holds,  hence  we  have 


+  b,  where  a  and 


Q\{b)  <  Q\{m) 


n+1 


n+1 


From  (5.59),  (5.62),  (5.63)  and  (5.66),  we  have 


s\{m)  >  s\{m) 

n+1 


(5.62) 


(5.63) 


(5.64) 


(5.65) 


which  means  that  the  speed  of  convergence  should  be  faster  as  we  introduce  one  more  benign 
node.  Let  us  take  a  numerical  example,  suppose  that  ~  we  introduce  more 

benign  nodes,  let  us  say  that  becomes  Since  ^  we  have  <  jq+r- 

Hence  we  know  that  the  speed  of  convergence  will  be  be  faster.  We  illustrate  the  following 
equation  in  Table  5.5 


s{{m)  ~ 


1 


1  ,  Q\(b) 


(5.66) 


Q{(m} 


We  only  select  a  few  decreasing  values  of  Q\{b)  and  increasing  values  of  Q\{m)  in  Table 
5.5.  As  Table  5.5  illustrates,  r\{m)  will  go  up  as  Q\{b)  decreases  and  Q\{m)  increases. 


For  the  purpose  of  analysis,  we  consider  only  the  case  where  there  is  one  malicious 
node  inside  the  network.  When  the  sensor  network  has  a  different  number  of  malicious 
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Table  5.5:  Example  values  of  (5.66). 


Q\{b) 

Q\{m) 

s\{m) 

0.5 

0.5 

0.5 

0.4 

0.6 

0.6 

0.3 

0.7 

0.7 

0.2 

0.8 

0.8 

0.1 

0.9 

0.9 

nodes,  it  becomes  difficult  to  analyze.  Qi(A)  is  a  collection  of  compatibility  functions, 
r*(A).  The  value  that  r*(A)  gives  depends  on  the  actual  triples  and  how  they  perform  the 
localizations.  It  is  difficult  to  predict,  as  n  gets  larger,  whether  a  Q((A)  will  be  larger  or 
smaller  simply  because  it  has  more  rl(A)  in  the  summation. 

5.5.3  Experiments 

We  simulate  a  sensor  network  of  n  nodes,  in  which  only  the  first  node  is  malicious. 
We  then  record  the  number  of  iterations  required  to  converge.  After  we  repeat  such  experi¬ 
ments  for  10  times,  we  average  the  number  of  iterations  and  show  them  in  Figure  5.21.  We 
can  see  that  the  system  converges  faster  as  n  gets  larger.  This  agrees  with  our  analytical 
solution  in  (5.65). 

However,  the  speed  of  convergence  will  not  get  larger  as  long  as  the  network  grows 
to  a  certain  size.  Recall  that,  by  definition,  0  <  Q\{\)  <  1.  Hence  in  this  example,  Q\{m) 
is  getting  closer  to  1,  while  Q\{b)  is  getting  smaller  and  closer  to  0.  Let  us  consider  a 
numerical  example.  Let  us  say  that  ^t?’\  =  ^.  As  the  network  size  gets  larger, 

L^-^yTTL)  Ll.y  {^-^yTTL) 

becomes  ,  which  is  not  much  different  from  ^ .  Hence  the  speed  of  convergence  will  not 
indefinitely  increase  as  the  network  size  grows. 
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Figure  5.21:  Number  of  iterations  required  to  reach  convergence  at  various  number  of 
network  sizes.  We  can  see  that  the  system  converges  faster  as  n  gets  larger.  However,  the 
speed  of  convergence  does  not  get  any  larger  as  the  network  reaches  a  certain  size  (20  in 
this  experiment). 
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Chapter  6 

Secure  Tracking  Using  Relaxation 
Labeling 


Our  algorithm  can  successfully  detect  malicious  nodes  in  event  localization  scenar¬ 
ios.  Now  we  design  and  extend  the  relaxation  labeling  algorithm  for  secure  tracking  issues. 
For  background  on  target  tracking,  please  refer  to  Chapter  3. 

6.1  Related  Work 

There  is  little  or  no  literature  on  the  topic  of  secure  tracking  in  sensor  networks. 
Capkun  et  al.  [6]  propose  a  protocol  to  securely  verify  the  time  of  encounters  in  multi-hop 
networks.  However,  Capkun  et  al.  [6]  did  not  address  Bayesian  tracking  as  defined  by  (3.1) 
and  (3.2).  Besides,  the  work  in  [6]  is  based  on  an  ad  hoc  network,  which  is  different  from 
our  centralized  setting. 

Our  work  uses  relaxation  labeling  to  identify  the  malicious  nodes  in  the  sensor 
network,  and  use  only  information  from  benign  nodes  to  build  the  particle  filter  to  perform 
target  tracking  [10].  Both  the  problem  definition  and  the  proposed  algorithm  are  novel  [10]. 

6.2  Algorithm  to  Detect  Malicious  Nodes  while  Tracking 

We  propose  a  secure  tracking  algorithm  based  on  relaxation  labeling  algorithms  [51, 
66,  67,  52,  31,  42].  Whereas  for  localization,  at  least  three  active  nodes  are  required  to  lo- 
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calize  an  event;  for  tracking,  fewer  may  be  used.  For  example,  in  [47],  Liu  et  al.  propose  to 
perform  target  tracking  by  activating  one  node  at  a  time.  However,  for  security  purposes, 
we  activate  two  or  more  nodes  at  each  time  step  k  in  the  sensor  networks.  With  such  redun¬ 
dancy,  any  inconsistency  in  the  behavior  of  nodes  can  be  exploited.  Besides,  activating  more 
nodes  at  each  time  step  allows  us  to  remove  some  portion  of  them  that  are  deemed  malicious. 

We  begin  the  explanation  of  our  algorithm  by  considering  the  case  where  we  ac¬ 
tivate  two  nodes  at  each  time  step,  as  illustrated  in  Figure  6.1.  The  cases  in  which  we 
activate  three  or  more  nodes  will  be  discussed  later  in  this  section.  As  defined  in  Section 
3.2,  we  assume  that  each  sensor  node  has  adequate  processing  power,  and  can  calculate  the 
current  estimate  of  the  target  location  based  on  previous  estimates.  Let  us  begin  with  time 
step  t  =  0,  when  the  initial  position  of  the  target,  p(x'^),  is  assumed  to  be  known.  p(x®)  is 
passed  to  the  two  nodes  (nodes  1  and  2)  activated^  at  time  step  t  =  1.  After  calculating 
p{x^\z^)  using  particle  filter  algorithms,  each  of  the  two  nodes  at  t  =  1  will  pass  p{x^\z^)  to 
both  of  the  two  nodes  at  t  =  2  (nodes  3  and  4).  Based  on  the  two  different  p(x^|z^)  from 
f  =  1  given,  the  two  nodes  at  t  =  2  will  each  produce  two  different  p(x^|2;^).  For  example, 
in  Figure  6.1,  node  3  receives  p(x^|2;^)  from  both  nodes  1  and  2,  so  it  can  calculate  two 
different  p(x^|z^)  based  on  them.  What  if,  in  Figure  6.1,  that  node  1  is  malicious,  and  node 
2  is  benign  (assuming  that  node  3  is  benign)?  Then  the  two  estimates  of  p(x^|2:^)  calculated 
by  node  3  could  be  drastically  different  since  the  p{x^\z^)  reported  by  node  1  is  already 
different  from  the  p(x^|2:^)  by  node  2. 

Following  such  logic,  we  design  our  relaxation  labeling  algorithm  based  on  sets  of 
three  nodes,  which  we  again  denote  as  triples.  There  are  three  types  of  triples 

1.  Type  I  :  a  triple  consisting  of  two  predecessor  nodes  and  one  successor  nodes.  For 
example,  nodes  (1,2,3)  and  (1,2,4)  in  Figure  6.1 

2.  Type  II:  a  triple  consisting  of  one  predecessor  node  and  two  successor  nodes.  For 
example,  nodes  (1,3,4)  and  (2,3,4)  in  Figure  6.1 

3.  Type  III:  a  triple  consisting  solely  of  nodes  at  the  same  time  instant.  For  example, 
nodes  (1,2,3)  and  (4,5,6)  in  Figure  6.5 

^We  use  the  node  activation  algorithm  discussed  in  Section  3.1.3. 
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Figure  6.1:  At  t  =  0,  we  assume  P(x^)  is  known,  and  this  information  is  passed  to  the  two 
nodes  activated  at  t  =  1.  Using  the  particle  filter  algorithm,  node  1  and  node  2  can  each 
calculate  p{x^\z^)^  and  those  information  is  passed  to  node  3  and  node  4.  Both  node  3  and 
node  4  have  two  inputs,  hence  they  will  each  produce  two  distinctive  p(x^|2:^).  Note  that  in 
this  model,  each  node  (e.g.  node  3)  reports  BOTH  of  its  estimates  to  the  central  processor. 


In  our  algorithm,  the  nodes  in  a  Type  III  triple  do  not  pass  information  to  each 
other,  so  there  is  no  way  to  exploit  their  inconsistency.  We  will  only  use  Type  I  and  Type 
II  triples  in  our  algorithm.  As  a  result,  we  will  design  two  different  compatibility  functions 
for  these  two  types  of  triples. 


Note  that  the  three  Types  are  mutually  exclusive  and  collectively  exhaustive  for 
triples.  However,  a  single  node  could  belong  to  one  or  more  Types.  For  example,  in  Figure 
6.1,  node  1  belong  to  both  (1,  2,3),  a  Type  I  triple,  and  (1,  3, 4),  a  Type  H  triple. 

When  malicious  nodes  collude  in  a  localization  scenario,  as  in  Chapter  2  and  5,  they 
report  ranges  which  are  distances  from  the  fictitious  event  position  to  the  node  positions. 
In  a  tracking  scenario,  nodes  report  a  PDF,  hence  the  way  they  collude  is  to  report  PDF’s 
whose  means  lies  on  the  fictitious  target  location,  as  defined  in  Chapter  3. 

6.2.1  Type  I  Triples 

In  this  section,  we  focus  on  Type  I  triples  and  how  we  define  the  compatibility 
function  for  them.  In  Figure  6.2,  we  illustrate  a  Type  I  triple,  and  we  denote  the  predecessor 
nodes  as  gl  and  g2]  and  the  successor  node  as  s.  For  clarity  of  notation,  explicit  references 
to  time  are  omitted  unless  required  by  context. 


Both  node  gl  and  node  g2  pass  information  to  node  s,  hence  node  s  can  examine 
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Figure  6.2:  Two  predecessor  nodes  passing  information  to  one  successor  node. 


the  difference  between  the  two  beliefs  that  it  produces  and  reports.  We  denote  the  belief 
that  node  s  calculates  based  on  the  belief  of  node  gl  as  pi(x|z).  Similarly,  we  denote  the 
other  belief  that  node  s  calculates  as  p2{x\z).  We  can  quantify  their  difference  by  comparing 
the  expected  values  of  pi{x\z)  and  p2(x|z) 


<  pi(x|2;)  >  -  <  P2{'^\z)  >  II  (6.1) 

j  x[pi{x\z)  -  p2{-x\z)]dx\\  (6.2) 

Let  us  consider  whether  gl,  g2  or  s  is  malicious  or  benign  here.  Since  each  node 
in  a  Type  I  triple  could  have  two  possible  labels:  malicious  or  benign,  there  ought  to  be 
2^  =  8  combinations. 

If  node  s  is  malicious,  then,  regardless  of  what  were  passed  to  it  from  its  prede¬ 
cessors,  node  s  will  report  two  p(x\z),  both  of  which  have  means  that  fall  on  the  fictitious 
target  location.  Hence  d  should  be  close  to  0  if  node  s  is  malicious. 

If  node  s  is  benign,  there  are  three  cases: 

1.  One  of  its  predecessor  is  malicious  (the  other  benign) 

2.  Both  nodes  gl  and  g2  are  malicious 


d  = 


where  ||  •  ||  denotes  the  2-norm. 


3.  Both  nodes  gl  and  g2  are  benign 
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If  node  s  is  benign,  and  one  of  its  predecessors  is  malicious,  the  two  outputs  from 
node  s  would  disagree  on  each  other.  Hence  we  expect  d  to  be  large  in  case  1.  However, 
cases  2  and  3  are  equivalent,  since  in  case  2  the  two  malicious  nodes  are  colluding  and  both 
point  to  the  fictitious  location.  Hence  in  both  case  2  and  case  3,  we  expect  d  to  be  small. 
We  list  all  the  possible  cases  for  a  Type  I  triple  in  Table  6.1.  Note  that  in  Figure  6.1,  the 
behaviors  of  Table  6.1  apply  to  nodes  (1,2,3)  and  (1,2,4).  A  formulation  for  “small”  and 
“large”  will  be  presented  in  Section  6.2.3. 


Table  6.1:  Expected  behaviors  in  d  for  a  Type  I  triple 


Predecessor  1 

Predecessor  2 

Successor 

Behavior 

malicious 

malicious 

malicious 

d  ~  0 

malicious 

malicious 

benign 

d  small 

malicious 

benign 

malicious 

d  ~  0 

malicious 

benign 

benign 

d  large 

benign 

malicious 

malicious 

d  ~  0 

benign 

malicious 

benign 

d  large 

benign 

benign 

malicious 

d  ~  0 

benign 

benign 

benign 

d  small 

6.2.2  Type  II  Triples 

In  a  Type  H  triple,  one  predecessor  node  will  pass  its  p{:x.\z)  to  two  different  suc¬ 
cessors.  We  highlight  such  scenario  in  Figure  6.3.  We  call  the  predecessor  node  g,  and  the 
two  successors  node  si  and  s2  in  this  section.  Assuming  that  node  g  is  benign,  and  one 
of  the  successors  is  malicious  (and  the  other  benign),  the  outputs  from  the  two  successor 
nodes  would  be  different.  Again,  we  can  exploit  the  inconsistency  whenever  the  nodes  are 
behaving  differently. 

We  denote  (for  Type  H  triples)  the  belief  that  node  si  calculates  as  pi{x\z),  and 
it  has  a  different  meaning  from  the  pi(x|2:)  defined  for  Type  I  triples.  Similarly,  the  belief 
that  node  s2  calculates  is  denoted  as  p2{x\z).  Then  we  can  calculate  the  difference  between 
pi{x\z)  and  p2(x|2:)  using  (6.1). 


Regardless  of  whether  node  g  is  malicious  or  not,  as  long  as  one  successor  is 
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Figure  6.3:  One  predecessor  node  passing  information  to  two  successor  nodes. 

malicious  and  the  other  is  not,  we  expect  d  to  be  large.  What  if  both  of  node  si  and  node 
s2  are  benign?  Since  they  are  given  the  same  input  from  node  g,  their  outputs  should  be 
similar  because  they  are  both  benign  nodes.  Hence  we  expect  d  to  be  small  when  both  of 
node  si  and  node  s2  are  benign.  If  both  of  node  si  and  node  s2  are  malicious,  we  expect 
d  ~  0,  since  no  matter  what  their  inputs  are,  they  will  collude  in  their  reports.  We  list  all 
8  possible  behaviors  in  Table  6.2. 


Table  6.2:  Expected  behaviors  in  d  for  a  Type  II  triple 


Predecessor 

Successor  1 

Successor  2 

Behavior 

malicious 

malicious 

malicious 

d  ~  0 

malicious 

malicious 

benign 

d  small 

malicious 

benign 

malicious 

d  small 

malicious 

benign 

benign 

d  small 

benign 

malicious 

malicious 

d  ~  0 

benign 

malicious 

benign 

d  large 

benign 

benign 

malicious 

d  large 

benign 

benign 

benign 

d  small 

6.2.3  Algorithm  Details 

The  expected  behaviors  in  Table  6.1  and  6.2  are  the  core  of  our  relaxation  labeling 
algorithm.  We  define  a  compatibility  function,  r,  based  on  d.  In  Table  6.1  and  6.2,  if  d  is 
expected  to  be  small  or  close  to  0,  we  have 


r  =  2e““‘^  -  1 


(6.3) 
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On  the  other  hand,  if  d  is  expected  to  be  large,  we  will  use  this  compatibility  function 


r  = 


1  _|_  g-a{d-l3) 


-  1 


(6.4) 


where  in  (6.3)  and  (6.4),  a  and  (3  are  parameters.  Note  that  the  compatibility  function  r 
will  always  return  a  value  between  -1  and  1.  The  higher  the  value  that  r  returns,  the  more 
compatible  the  3  nodes  are.  For  example,  in  Figure  6.1,  if  we  assume  node  1  is  malicious, 
node  2  is  benign,  and  node  3  is  malicious,  then  we  expect  d  ~  0,  as  in  Table  6.1.  We  then 
calculate  r  using  —  1.  If  the  assumption  that  node  1  is  malicious,  node  2  is  benign, 

and  node  3  is  malicious  is  indeed  true,  then  the  empirical  data,  d,  should  lead  r  to  return 
a  value  close  to  1.  Otherwise,  it  should  return  a  negative  number  (which  is  larger  than  -1). 


We  illustrate  some  parameter  settings  of  (6.3)  and  (6.4)  in  Figure  6.4.  In  Figure 
6.4(a),  we  expect  d  to  be  small  for  a  correct  labeling,  hence  a  small  d  will  lead  r  to  return  a 
value  close  to  1.  On  the  other  hand,  we  expect  d  to  be  large  in  Figure  6.4(b),  hence  a  large 
d  will  make  r  return  a  value  close  to  1. 


(a)  (6.3)  (b)  (6.4) 

Figure  6.4:  Illustration  of  some  parameter  settings  of  (6.3)  and  (6.4) 

So  if  we  assume  node  1  is  malicious,  how  is  that  compatible  with,  say,  node  2 
being  malicious  and  node  3  benign?  How  about  node  1  being  malicious,  node  2  and  3  both 
being  benign?  In  each  possible  case,  we  can  calculate  the  compatibility  function  r  using 
(6.3)  or  (6.4).  We  denote  malicious  as  m  and  benign  as  b,  and  we  define  a  function  q  which 
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accumulates  all  the  effects  of  labeling  node  i  as  A 

(6.5) 

i  k  A'  A" 

where  N  =  (n  —  l)(n  —  2),  n  is  the  number  of  nodes  in  the  network,  j  =  1, n,  k  =  1, 
j  ^  i,  k  ^  i,  j  ^  k,  and  P(-)  is  a  confidence  function  to  be  defined  and  explained  later. 
As  before,  in  (6.5),  g((A)  is  a  way  to  tally  all  the  possibilities  when  we  label  node  i  as  A, 
label  node  j  as  X'  and  label  node  k  as  X" .  Note  that  in  (6.5),  t  is  the  iteration  number,  and 
this  superscript  on  the  right-hand  side  has  been  omitted  for  clarity.  Furthermore,  in  (6.5), 
we  exclude  all  the  cases  that  node  i,  j  and  k  are  at  the  same  time  step  in  the  tracking  process. 

Finally,  we  define  confidence  P{X)  as  in  Chapter  5.  The  confidence  of  node  i 
having  label  Xj  is  denoted  as  Pi{Xj),  and 


and 


remain  the  same. 


0  <  Pi(Aj)  <  1  Vi,j 


E^.(A,)  =  1. 

j 


(6.6) 

(6.7) 


Note  that  in  (6.7),  we  only  have  two  labels,  hence  Xj  G  {m,b}.  Being  labeled  as 
m  means  that  this  node  is  malicious,  and  being  labeled  Xh  means  benign.  Following  [51], 
we  will  iteratively  update  the  confidence  of  node  i  having  label  j  as 


P/Oi)  [1  +  .;.‘(A,)] 


(6.8) 


where  Dj  =  Pi(Afc)  [l  -|-  (?((Afc)]  is  a  normalization  required  to  ensure  that  PfiXj)  sums 
to  1,  and  t  stands  for  iteration.  If  P^i'm)  converges  to  1,  while  P^ifi)  converges  to  0,  then 
node  2  is  found  to  be  malicious. 


In  Figure  6.1,  we  apply  the  relaxation  labeling  process  to  nodes  1,  2,  3  and  4.  If, 
say,  Pi{m)  approaches  1  after  certain  iterations  in  the  relaxation  labeling  process,  then  we 
determine  node  1  to  be  malicious.  After  the  relaxation  labeling  process,  we  remove  mali¬ 
cious  nodes,  and  average  the  p{^\z)  of  benign  nodes  at  t  =  2.  The  averaged  p{x\z)  is  passed 
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on  to  the  two  nodes  active  at  t  =  3.  The  two  nodes  at  t  =  3  will  each  calculate  p{x.\z),  and 
pass  them  on  to  the  two  nodes  at  t  =  4.  We  then  use  the  four  nodes  at  t  =  3,4  to  perform 
relaxation  labeling.  After  the  malicious  nodes  are  detected,  we  average  the  correct  results 
from  benign  nodes,  and  produce  a  single  p{x\z)  to  transmit  to  the  two  nodes  at  t  =  5.  The 
process  repeats  afterwards. 

It  is  also  possible  to  activate  more  nodes  at  each  time  step.  What  is  more,  if  we 
have  more  than  3  nodes,  we  can  not  only  do  tracking,  but  also  localization  of  the  target. 
This  will  provide  added  security  mechanism  using  the  secure  localization  algorithms.  Fig¬ 
ure  6.5  illustrates  the  case  when  we  activate  three  nodes  at  each  time  step  in  tracking. 
We  will  examine  any  combination  of  3  nodes  in  the  relaxation  labeling  process,  except  the 
combination  of  nodes  (1,2,3)  or  nodes  (4,5,6).  If  we  select  two  predecessor  nodes  and 
one  successor  node,  we  should  use  the  compatibility  functions  as  in  Table  6.1.  Similarly,  if 
the  3  nodes  that  we  choose  have  one  predecessor  and  two  successors,  we  will  use  the  com¬ 
patibility  function  in  Table  6.2.  The  algorithm  is  summarized  in  Table  6.3  and  in  Figure  6.6. 


Figure  6.5:  We  can  activate  three  nodes  at  each  time  step  during  the  tracking  process.  Each 
successor  node  will  have  three  inputs,  hence  it  can  produce  three  different  outputs.  The 
inconsistency  between  its  three  outputs  can  be  used  in  the  relaxation  labeling  process. 
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Table  6.3:  Secure  tracking  algorithm  using  relaxation  labeling 


time  t  =  A: 


time  t  =  A:  +  1 


time  t  =  A:  +  2 


Receive  p{xk-i\zk-i) 

(if  A:  ==  1,  p{xq\zo)  =  p(xo)  is  known) 

Activate  n  nodes 

Node  i  calculates  p^{xk\zk) 

Activate  n  nodes 

Each  node  receives  p^{xk\zk),  i  =  1, . . .  ,n 
Node  j  calculates  p^ {xk+i\zk+i),  j  =  1, . . .  ,n 
Use  relaxation  labeling  algorithm 
Remove  malicious  nodes 
Average  the  results  to  obtain  p{-x.k+i\zk+i) 

Go  to  time  t  =  k]  use  the  same  algorithm  except 

k  is  replaced  with  {k  +  2) 

p{xk-i\zk-i)  is  replaced  with  p{xk+i\zk+i) 


Figure  6.6:  Illustration  of  the  relaxation  labeling  algorithm.  The  rectangular  box  stands  for 
relaxation  labeling  algorithm.  At  each  time  step  (except  time  0),  we  activate  3  nodes.  For 
every  3  nodes  (except  time  0) ,  we  perform  relaxation  labeling  algorithm  to  detect  malicious 
nodes.  After  removing  malicious  nodes  and  average  the  results  from  benign  nodes,  the 
relaxation  labeling  algorithm  produces  a  correct  result  and  pass  it  on  to  the  next  time  step. 
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6.3  Experimental  Results 

In  this  section,  we  demonstrate  results  of  relaxation  labeling  for  secure  tracking. 
The  target  in  our  experiments  is  traveling  in  a  two-dimensional  space.  That  is,  the  state 
vector  is  X  =  [x,  y,  x,  y]^.  We  begin  our  experiments  with  a  linear  motion  model  of  the 
target  as 

+  [0.25,  0.25,  0,  0]*  +  (6.9) 

where  is  the  process  noise  and  the  initial  state  is  x  =  [0.1, 0.2]*.  In  (6.9),  we  assume 
that  the  target  is  traveling  at  a  constant  velocity.  This  assumption  is  for  the  simplicity 
of  our  simulations,  and  such  restrictions  can  be  relaxed.  For  example,  in  [64],  Zhou  et  al. 
proposed  a  linear  motion  model  in  which  both  the  velocity  and  noise  variance  are  updated 
at  each  time  step.  Another  issue  worth  of  note  is  that,  although  (6.9)  is  linear  and  looks 
simple,  in  modeling  real-world  applications  such  as  tracking  objects  in  video  sequences  [64] 
a  linear  model  does  a  fairly  good  job. 

Our  sensor  model  is 


tc*  is  the  measurement  noise. 


(6.10) 


To  choose  suitable  variances  for  the  process  and  measurement  noises,  consider  the 
evolution  of  the  state  sequences,  x,  over  a  certain  period  of  time  steps.  For  example,  here 
we  choose  time  steps  t  =  0, 1,  •  •  •  ,  50.  At  t  =  0,  the  target  starts  at  the  origin  of  our 
coordinate  system,  that  is,  it  is  given  an  initial  position  of  x  =  [0.1, 0.2]*.  We  confine  our 
observations  to  be  within  this  50-time-step  window,  because  such  a  window  is  wide  enough 
to  demonstrate  the  dynamics  of  the  target  maneuvering  and  to  show  the  effectiveness  of  our 
algorithm.  The  assumption  of  the  initial  position,  x  =  [0.1,  0.2]*,  is  also  reasonable  because 
although  the  target  is  constantly  maneuvering,  we  can  always  choose  any  time  step  (during 
the  path  of  the  target)  as  t  =  0  and  that  particular  target  location  as  the  origin  of  our 
coordinate  system. 

The  initial  position  of  the  target  is  x  =  [0.1, 0.2]*,  and  it  is  traveling  with  a  con- 
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stant  velocity  of  x  =  [0.25,0.25].  After  50  time  steps,  therefore,  it  will  be  at  a  position 
around  x  =  [12.6, 12. 7]^  Adding  the  influence  of  noise,  we  expect  that  the  position  of 
the  target,  at  t  =  50,  will  generally  have  a  range  of  0  <  xi,X2  <  15.  Hence  we  choose  an 


isotropic  covariance  matrix  for  the  disturbance,  v,  to  be  Q  = 


0.15  0 

0  0.15 


To  choose  an 


appropriate  noise  variance,  we  also  observe  from  (6.10)  that  the  measurement  is  obtained 


by 


50 


-,  where  jjx^  —  is  the  distance  from  the  current  target  position  to  the  node 


position.  We  use  the  node  activation  algorithm  in  Section  3.1.3  to  activate  nodes,  hence 
the  activated  nodes  are  fairly  close  to  the  latest  target  position.  Hence  we  can  assume  that 
0  <  llx^  —  £,  ||  <  2,  and  0  <  „  „  <  30.  So  we  choose  the  variance  of  the  measurement 

II  s*ii  ,  llx^-Cill 

noise  to  be  3.0,  although  it  is  trivial  to  vary  in  simulations. 


6.3.1  Tracking  with  Multiple  Sensor  Nodes 

We  begin  our  experiments  with  target  tracking  by  activating  multiple  (redundant) 
sensor  nodes.  The  tracking  result  in  a  two-dimensional  space  using  three  sensor  nodes  is 
shown  in  Figure  6.7.  Note  that  the  three  sensor  nodes  act  independently.  We  denote  the 
nodes  as  si,  S2  and  S3.  The  node  at  t  =  k,  s^,  will  pass  its  estimate,  p{x.^\z^)  to  node  s^^^^ 
only.  Similarly,  s^  will  pass  information  to  s^^^  only,  and  S3  will  interact  with  Sg"*"^  only. 
In  other  words,  the  three  activated  nodes  at  any  time  step  act  independently,  as  shown 
in  Figure  6.7.  The  central  processor  collects  all  of  the  data  from  each  active  sensor  node, 
and  it  performs  the  relaxation  labeling  algorithm  to  detect  malicious  nodes  before  activate 
future  nodes. 

We  can  also  activate  more  sensor  nodes  to  perform  tracking.  For  example,  we 
activate  four  sensor  nodes  in  Figure  6.8  and  five  nodes  in  6.9.  Note  that  in  Figure  6.7, 
Figure  6.8  and  Figure  6.9,  there  is  no  malicious  node  present. 

In  the  simulations  illustrated  in  Figure  6.8  and  Figure  6.9,  all  of  the  nodes  can 
sense  the  target,  and  the  only  difference  between  them  is  the  number  of  nodes  activated. 
Hence  the  node  positions  are  not  shown.  Node  selection  is  shown  in  the  next  section. 
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X  position 


Figure  6.7:  Tracking  of  the  target  in  a  two-dimensional  space,  x,  by  using  three  sensor  nodes. 
We  activate  three  sensor  nodes  at  each  time.  The  three  sensor  nodes  act  independently  to 
perform  tracking. 


X  position 

Figure  6.8:  Tracking  of  the  target  by  using  four  sensor  nodes.  The  experimental  setup  is 
identical  to  Figure  6.7.  The  only  difference  is  that  we  activate  four  nodes. 
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Figure  6.9:  Tracking  of  the  target  by  using  five  sensor  nodes.  This  experiment  is  similar  to 
Figure  6.7  except  that  we  activate  five  nodes. 

6.3.2  Node  Selection  Algorithm 

We  follow  the  node  selection  algorithm  proposed  in  [47]  to  calculate  the  mutual 
information  between  and  .  Here  we  demonstrate  a  target  traveling 

in  a  two-dimensional  space,  as  shown  in  Figure  6.10.  In  the  area  that  the  target  is  likely  to 
pass,  we  have  deployed  500  sensor  nodes  which  are  randomly  located  within  [0, 15]  x  [0,15]. 
At  every  time  step,  we  choose  20  nodes  which  are  closest  to  the  current  (known)  target  lo¬ 
cation,  x^.  Then  for  each  of  the  20  nodes,  we  can  calculate  the  mutual  information  between 
p(xfc+i|^fc)  p[z^~^^\z^)  [47],  and  pick  the  3  nodes  that  have  the  highest  mutual  informa¬ 
tion  values.  Figure  6.10  shows  the  three  nodes  that  have  the  highest  mutual  information 
at  t  =  20,  while  Figure  6.11  shows  the  three  nodes  at  t  =  30.  Recall  from  Section  3.1.3 
that  the  calculation  of  mutual  information  (for  activation  of  the  next  nodes)  is  equivalent 
to  calculating  velocity  implicitly. 

We  enlarge  the  sensor  field  to  [0,15]  x  [0,15]  because  we  are  dealing  with  a  ma¬ 
neuvering  target  in  Chapter  6.  The  target  is  maneuvering  with  a  certain  unpredictable 
disturbance,  so  it  is  better  to  enlarge  the  sensor  field  to  [0, 15]  x  [0, 15]. 

^This  is  explained  in  Section  3.1.3 
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Figure  6.10:  Illustration  of  the  node  activation  results.  The  states  (only  positions  are 
shown)  of  the  target  are  shown  as  diamonds.  At  t  =  20,  20  nodes  are  chosen  to  calculate 
the  mutual  information,  which  are  shown  as  circles.  Among  them,  three  nodes  that  have 
the  highest  mutual  information  will  be  selected  as  active  nodes  at  t  =  21,  which  are  shown 
as  filled  circles.  We  can  see  from  the  trajectory  of  the  target  at  t  =  19,  20  that  the  current 
best  estimate  of  velocity  is  in  the  northwest  direction.  Hence  the  three  nodes  which  are  in 
the  northwest  direction  of  the  target  position  at  t  =  20  are  activated.  This  agrees  with  the 
highest  mutual  information  that  we  calculated. 
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Figure  6.11:  Illustration  of  the  node  activation  algorithm.  This  is  from  the  same  experiment 
on  Figure  6.10,  except  that  the  result  here  is  extracted  at  t  =  30.  Unlike  Figure  6.10,  it  is 
harder  to  see  where  the  target  is  heading  based  on  its  trajectory  at  t  =  28, 29, 30.  Hence 
the  three  nodes  activated  at  t  =  30  do  not  appear  to  fall  on  one  particular  spot  that  the 
target  is  likely  heading. 
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6.3.3  Secure  Tracking  Results 

In  this  section,  we  demonstrate  the  performance  of  relaxation  labeling  in  secure 
tracking,  as  compared  to  averaging  the  results  from  the  multiple  sensor  nodes.  Recall  that 
we  have  activated  several  multiple  sensor  nodes  at  each  time  step,  and  they  act  indepen¬ 
dently  to  perform  target  tracking.  One  method  of  secure  tracking  is  simply  to  average  the 
calculated  p{:x.\z)  over  the  sensor  nodes  at  each  time  step.  We  denote  this  method  as  aver¬ 
aging.  Our  algorithm,  however,  detect  malicious  nodes  first,  and  average  the  results  from 
only  benign  nodes. 

First,  we  activate  three  sensor  nodes  in  Figure  6.12.  Denoting  the  three  nodes  at 
each  time  steps  as  sf,  S2  and  S3,  we  choose  one  node  to  be  malicious  at  every  ten  time 
steps.  In  Figure  6.12,  we  have  s}®,  s|^,  S3®,  s^®  and  s^®  as  malicious  nodes.  Over  the  60 
time  steps,  we  can  average  the  result  of  the  three  paths,  and  calculate  the  mean-squared 
error  (MSE)  with  respect  to  the  true  target  path.  The  MSE  of  such  averaging  algorithm  is 
1.2349  in  Eigure  6.12. 


MSE=  1.2349 


Eigure  6.12:  Adding  malicious  nodes  to  the  sensor  network.  We  choose  sensor  nodes  s}®, 
S2®,  S3®,  S3®  and  si®  to  be  malicious.  In  other  words,  there  is  one  malicious  nodes  (out  of 
three)  at  t  =  10,  20, 30, 40,  50. 
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Next,  we  compare  the  performance  of  using  relaxation  labeling  and  averaging 
(without  relaxation  labeling).  Figure  6.13  shows  the  result  of  activating  three  sensor  nodes. 
By  using  relaxation  labeling  to  remove  malicious  nodes,  the  MSE  decreases  to  0.4938  in 
Figure  6.13. 


Figure  6.13:  Comparison  of  relaxation  labeling  and  averaging.  The  solid  line  is  obtained  by 
averaging  the  three  paths  in  Figure  6.12.  The  dotted  line  is  obtained  by  removing  malicious 
nodes  using  relaxation  labeling. 

Next,  we  activate  four  nodes  at  each  time  step,  as  shown  in  Figure  6.14.  The  ma¬ 
licious  nodes  are  s}®,  si®  and  sf®.  The  MSE  over  the  60  time  steps  is  0.7090,  which 

is  better  than  the  MSE  in  Figure  6.12.  This  comes  at  the  price  of  activating  more  sensor 
nodes  and  utilizing  more  system  resources.  The  result  using  relaxation  labeling  to  remove 
malicious  nodes  is  shown  in  Figure  6.15.  We  can  see  that  the  MSE  by  using  relaxation 
labeling  has  been  decreased  to  0.3669. 

In  a  similar  manner,  we  activate  five  nodes  in  Figure  6.16.  The  result  of  using 
relaxation  labeling  to  remove  malicious  nodes  is  shown  in  Figure  6.17.  Figure  6.16  and 
Figure  6.17  have  identical  experimental  setup  as  the  previous  one,  except  that  we  active 
five  sensor  nodes.  It  is  worth  noting  that  since  we  only  have  one  malicious  node  present 
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MSE  =  0.7090 


Figure  6.14:  Tracking  result  with  malicious  nodes.  We  activate  four  sensor  nodes,  and  there 
is  one  malicious  node  (which  can  sense  the  target)  at  t  =  10,20,30,40,50,  and  that  node 
remains  active  (malicious)  for  only  one  time  step 


X  position 


Figure  6.15:  Comparison  of  the  tracking  performance.  The  solid  line  is  obtained  by  averag¬ 
ing  the  result  in  Figure  6.14,  and  its  MSE  is  0.7090.  The  dashed  line  is  obtained  by  using 
relaxation  labeling  to  remove  malicious  nodes.  Its  MSE  is  0.3669,  which  is  smaller  than  the 
MSE  of  averaging. 
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and  we  activate  five  sensor  nodes,  averaging  has  relatively  good  results.  This  is  because 
that  the  malicious  node  is  a  minority  among  the  five  nodes.  Using  relaxation  labeling  still 
provides  a  better  MSE,  0.4081. 


MSE  =  0.8715 


X  position 


Figure  6.16:  Tracking  result  with  malicious  nodes  by  activating  five  sensor  nodes. 
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Figure  6.17:  Comparison  of  the  tracking  performance  for  five  active  nodes. 

As  a  final  demonstration,  we  activate  five  nodes  in  Figure  6.18.  In  this  experiment, 
we  choose  one  malicious  node  at  two  consecutive  time  steps.  That  is,  the  malicious  nodes 
are 

1.  and  ^2^ 

2.  §2°  a-iid 

3.  and 

4.  S4°  and 

5.  Sg®  and 

For  example,  when  we  apply  the  relaxation  labeling  algorithm  to  t  =  10  and  t  =  11, 
we  will  have  one  malicious  node  at  both  time  steps.  We  further  examine  the  probability  of 
each  node  being  malicious  at  t  =  10  and  t  =  11  in  Figure  6.19.  Figure  6.19  shows  that  we 
correctly  identify  and  to  be  malicious  nodes,  which  is  a  correct  result.  The  relaxation 
labeling  algorithm  took  about  20  iterations  to  converge,  which  is  very  fast.  There  are  other 
malicious  nodes  at  t  =  20, 30, 40, 50.  The  probabilities  of  being  malicious  for  those  nodes 
at  t  =  20,30,40,50  are  listed  in  Appendix  D  for  the  ease  of  reading.  The  final  tracking 
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result  using  relaxation  labeling  is  shown  in  Figure  6.20.  Figure  6.20  shows  that  detection 
of  malicious  nodes  are  successful  for  t  =  10, 20, 30, 40, 50. 


Figure  6.18:  Tracking  performance  under  the  influence  of  malicious  nodes.  Note  that  no 
secure  tracking  algorithm  is  performed  in  this  figure. 
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Figure  6.19:  Probability  of  being  malicious  nodes  for  the  five  nodes  at  f  =  10  and  another 
five  nodes  at  t  =  11.  Hence  we  have  5  x  2  =  10  P{m)  here.  The  first  five  probabilities  are 
for  P{\)  at  t  =  10.  The  last  five  are  for  t  =  11.  We  can  see  that  at  t  =  10,  the  first  node 
is  found  to  be  malicious,  while  at  t  =  11,  the  second  node  is  found  to  be  malicious.  This 
agrees  with  the  actual  data. 
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Figure  6.20:  Tracking  performance  under  the  influence  of  malicious  nodes.  Two  secure 
tracking  results  are  shown  in  this  figure.  One  is  averaging  (shown  in  solid  line),  and  the 
other  is  relaxation  labeling  (shown  as  the  dashed  line).  The  MSE  for  averaging  is  1.2615. 
The  MSE  for  relaxation  labeling  is  0.7331. 
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Chapter  7 

Conclusions  and  Future  Work 

7.1  Conclusions 

In  this  dissertation,  the  problems  of  secure  localization  and  secure  tracking  for 
sensor  networks  were  investigated.  Malicious  nodes  are  assumed  to  be  colluding,  and  the 
only  way  to  detect  them  is  by  their  behaviors.  A  new  relaxation  labeling  algorithm  was 
proposed  which  employs  higher-order  compatibility  functions  to  detect  malicious  nodes,  and 
then  uses  the  report  from  benign  nodes  to  perform  localization  and  tracking.  The  perfor¬ 
mance  of  the  relaxation  labeling  algorithm  has  been  demonstrated  with  both  simulations 
and  field  experiments.  The  result  of  secure  localization  has  also  been  compared  with  an 
existing  algorithm  based  on  majority  voting.  Our  algorithm  is  shown  to  be  able  to  de¬ 
tect  malicious  nodes,  even  when  they  are  colluding.  Choice  of  suitable  parameters  for  the 
relaxation  labeling  algorithm  was  also  discussed. 

7.2  Future  Work 

In  Chapter  6,  we  considered  the  target  tracking  problem  for  sensor  networks.  Tar¬ 
get  tracking,  as  explained  in  Chapter  3,  consists  of  two  parts:  prediction  and  update.  The 
update  stage  hinges  on  the  measurement  made  by  the  sensor  nodes  -  that  is,  we  can  only 
correctly  estimate  the  current  location  of  the  target  if  sensor  nodes  make  measurement  of 
the  target  range. 


A  future  research  direction  is  to  consider  that,  due  to  the  difficulty  of  the  terrain. 


no 


some  parts  of  our  monitored  area  do  not  have  sensor  nodes  deployed.  However,  the  target 

may  go  through  these  areas.  In  other  words,  our  sensor  network  will  inevitably  lose  track 

of  the  target  for  several  time  steps  when  the  target  is  within  the  difficult  domain.  For 

example,  consider  the  scenario  depicted  in  Figure  7.1  which  has  a  shaded  area  where  no 

sensor  node  can  be  deployed.  We  denote  the  shaded  area  as  the  unavailable  area.  Therefore, 

when  the  target  passes  through  the  unavailable  area,  we  cannot  make  measurement  of  the 

range  between  the  target  and  any  sensor  node.  The  problem  is  then  to  decide  which  sensor 

nodes  to  the  right  of  the  unavailable  area  (assuming  that  the  target  is  traveling  from  left 

to  right)  should  we  activate  in  order  not  to  lose  track  of  the  target? 

•  target  (actual  position) 
o  target  (possible  position) 

=  5? 

t  =  5? 

O 

5? 

t  =  5? 

O 

Figure  7.1:  A  possible  scenario  where  a  target  travels  through  some  obstacles  where  no 
sensor  nodes  are  deployed.  In  this  scenario,  the  target  is  traveling  from  the  left  to  the  right. 
The  shaded  area  is  where  there  is  no  sensor  nodes  are  deployed.  For  example,  consider 
that  the  shaded  area  is  a  river.  The  target  is  a  tank  that  we  are  tracking.  There  are  no 
sensor  nodes  deployed  in  the  river,  however,  the  tank  can  successfully  pass  through  the  river. 
Assume  that  it  takes  one  time  step  for  the  tank  to  pass  the  river,  we  cannot  determine  the 
location  of  the  tank  at  t  =  4.  The  problem  is  to  determine  which  nodes  to  the  right  of  the 
unavailable  area  should  we  activate,  at  t  =  5,  in  order  not  to  miss  the  tank? 

One  possible  approach  to  this  research  problem  is  to  consider  a  possible  radius  and 
a  possible  angle  for  the  target  to  move,  based  on  the  available  motion  model  of  the  target. 
Based  on  the  last  available  target  position,  we  may  predict  a  eandidate  area  wherein  the 
target  will  be,  at  the  immediately  following  time  step.  Depending  on  the  number  of  time 
steps  that  the  target  will  not  be  measured,  we  may  expand  this  candidate  area  based  on 
the  available  motion  model  of  the  target.  Hence  the  nodes  within  the  expanded  candidates 
area  which  are  located  beyond  the  unavailable  area  should  be  activated.  Figure  7.2  shows 


Ill 


an  example  of  a  fan-shaped  candidate  area.  Based  on  the  linear  motion  model  of  the  target, 
we  can  calculate  the  radius  of  the  fan-shaped  area  based  on  the  speed  of  the  target.  The 
angle  of  the  fan-shaped  area  is  also  determined  according  to  the  past  trajectory  of  the  target 


and  known  bounds  on  accelerations.  Hence  we  may  expand  the  candidate  area  for  all  the 
necessary  time  steps  that  we  have  no  measurement  of  the  target.  The  nodes  to  be  activated 


will  be  within  the  expanded  candidate  areas. 


Figure  7.2:  Candidate  areas  for  node  activations.  The  candidate  area  is  calculated  based 
on  an  estimated  speed  and  an  estimated  turning  angle  of  the  target.  First,  the  candidate 
area  for  t  =  4  is  calculated.  Then  the  candidate  area  for  t  =  5  is  calculated  based  on  the 
candidate  area  for  f  =  4.  Those  nodes  inside  the  candidate  area  for  t  =  5  will  be  activated 
to  detect  possible  target  appearances. 


Obviously,  the  candidate  area  will  enlarge  as  time  progresses.  If  the  target  is 
undetected  for  several  time  steps  (within  the  unavailable  area) ,  the  number  of  nodes  required 
to  be  activated  will  be  large.  Another  possible  research  approach  is  to  predict  a  maximum 
likelihood  position  of  the  target.  In  other  words,  we  try  to  extend  the  target  trajectory.  For 
example,  Jonker  et  al.  provide  an  algorithm  for  path  extension  [36] .  After  we  arrive  at  the 
maximum  likelihood  estimate  of  the  target  location  beyond  the  unavailable  area,  we  may 
turn  on  only  the  nodes  close  to  the  maximum  likelihood  target  location.  In  this  way,  we 
do  not  have  to  activate  a  large  number  of  nodes  and  waste  precious  system  resources.  One 
example  is  illustrated  in  Figure  7.3.  In  Figure  7.3,  we  try  to  extend  the  target  trajectory  and 
arrive  at  the  maximum  likelihood  target  location  at  t  =  5.  We  only  activate  nodes  within 
the  neighborhood  of  the  estimated  target  location  at  t  =  5.  The  size  of  the  neighborhood 
is  increased  or  decreased  based  on  our  confidence  of  the  estimated  target  location  at  t  =  5. 
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•  target  (actual  position) 
o  target  (possible  position) 


Figure  7.3:  Maximum  likelihood  estimates  of  the  target  locations.  We  predict  the  maximum 
likelihood  target  location  of  the  target  at  t  =  4.  Based  on  the  estimated  target  location  at 
f  =  4,  we  predict  the  maximum  likelihood  location  of  the  target  at  t  =  5.  We  only  activate 
the  nodes  within  a  certain  neighborhood  of  the  estimated  target  location  at  t  =  5.  The  size 
of  the  neighborhood  can  be  adjusted  according  to  our  conhdence  of  the  prediction  of  the 
likely  target  position. 
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Appendix  A 

Derivation  of  the  Kalman  Filter 


In  this  section  we  derive  the  Kalman  filter^  [38,  39,  4].  The  motion  model  of  the 

target  is 


xfc  ^  pfcxfc-i  +  ^k-i 


(A.l) 


where  is  a  known  system  matrix.  Meanwhile,  the  measurement  model  is 


zfc  ^  ^ 


(A.2) 


where  is  also  a  known  measurement  matrix,  ^  and  are  zero-mean,  statistically 
independent,  and  have  covariances  of  and  R^,  respectively.  That  is 


=  E 


(A.3) 


=  F 


(A.4) 


We  begin  with  the  given  pdf  which  is  assumed  Gaussian,  i.e. 


=  AA(x^-^  pfc-i|fc-i) 


(A.5) 


which  is  to  say,  we  can  parameterize  p(x 


using 


m 


fc— l|fc— 1  ^ 


(A.6) 


^Appendix  A  derived  by  C.-C.  G.  Chang,  using  notation  from  [2]  and  the  approach  described  in  [4] 
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and  covariance 

Next,  we  calculate  the  intermediate  pdf  p(x^|z^“^)  based  on  First,  Kalman 

proposes  to  predict  the  state  estimates  at  t  =  k  using 

^fc|fc-i  ^  ^  (A.8) 

On  the  other  hand,  the  transition  of  the  true  states,  x^,  is  x^  =  F^x^“^  +  v^“^.  Therefore, 
we  can  calculate  the  error  between  the  state  estimates  and  true  states  using 

gfc|fc-i  ^ 

=  (F^x^-^  + 

Hence  for  we  have 

pfc|fc-i  ^  ^  gfc|fc-i  (^.12) 

=  E  +  (A.13) 

=  E  F^e’^-^ 

=  +  (A.15) 

The  intermediate  pdf  p(x^|z^“^)  is  another  Gaussian 


(A.9) 

(A.IO) 

(ATI) 


p{^^\z}'-^-^)  =  AA(x^ 


(A.16) 


Next,  we  will  calculate  the  desired  pdf  p(x^|z^)  using  the  intermediate  pdf  in 
(A.16).  We  will  soon  derive  that  the  desired  pdf  p(x^|z^)  is  also  Gaussian 


p(x^|z^^^)  =AA(x^m^l^P^I^) 
whose  parameters  are  calculated  using 

m^lfc  =  rn^|fc-i 


(A.17) 

(A.18) 
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-pklk  _  pfc|fc— 1  _  1 


and  where 


gfc  ^  jjfcpfc|fc-i 

is  the  covariance  of  the  innovation  term  and 

_  pfc|A:— 1 

is  the  Kalman  gain. 


(A.19) 


(A.20) 


(A.21) 


The  derivation  begins  with  calculating  using  Kalman  proposes  that 

the  measurement  prediction  is  so  the  measurement  residual  (also  referred 

to  as  the  innovation  term)  is 

=  (z’^  -  (A.22) 

Kalman  proposes  to  update  the  state  predictions  using  the  intermediate  state  predic¬ 
tions,  as 


^k\k  ^  ^k\k-i  ^  j^k  (A.23) 

where  is  the  Kalman  gain.  Substituting  the  measurement  model  in  (A. 2)  into  (A.23), 
we  have 


^k\k  ^  (^z'^  - 

=  f 


To  calculate  the  covariance  of  the  error  term  we  have 

,Tl 


pfc|fc  _  jjj 

=  E 


eMe^ 


X  —  m 


(A.24) 


(A.25) 

(A.26) 


Substituting  (A.24)  into  (A.26),  we  have 
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pA:|A: 


E 


(/  - 

[(/  -  K'=H^)(x^  - 


(A.27) 


The  error  term  x^  —  ^  is  uncorrelated  with  the  measurement  noise  w^,  so  we  have 

pfc|fc  ^  r(/_K^H'=)(x^ -  K^w^' 


(/  -  K^H'=)(x^  - 


iT 


(/  -  K'^H 


kTTk\T 


+  K'^E 


K' 


p/i;|/c— 1  _  jj/cp/c|/c— 1  _  p/c|/c— 1  /  ^  f 


+K*^  (  H^p^l^-i  (h^)^  +  r)  ( ^ 


(A.28) 

(A.29) 

(A.30) 


The  sum  of  the  diagonal  elements  of  a  matrix  is  the  trace  of  a  matrix.  In  the  case  of  a  error 
covariance  matrix  in  (A.30),  the  trace  is  the  sum  of  the  mean  squared  errors.  The  Kalman 
filter  is  a  Minimum  Mean  Square  Error  (MMSE)  estimator.  The  mean  squared  error  may 
be  minimized  by  minimizing  the  trace  of  and  by  minimizing  the  trace  of  P^l^  we  can 
obtain  the  desired  K^. 


We  differentiate  P^l*^  with  respect  to  in  order  to  find  the  conditions  of  this 
minimum.  Note  that  the  trace  of  a  matrix  is  equal  to  the  trace  of  its  transpose,  hence  we 
have 


p^[pfc|fc]  ^  p^pfc|fc-ij  _  2rr[K*'H^P^I^“^]  +  Tr 


J^k  (-^kpk\k-i  f-^k\  +rWk 


(A.31) 


where  Tr[-]  is  the  trace  of  a  matrix. 
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Differentiating  with  respect  to  gives 


Trp^l^] 


=  -2  ( +  2K^  ( u^pMk-i  ( nkY  +  R 


(A.32) 

(A.33) 


we  have 


^kpk\k-iY  =  K^  ( +  R 


Solving  for  K^,  we  have 


R-fc  ^  pfclfc-i/jjfcA  [  Hfcpfc|fc-i /jjfcA 


(A.34) 


(A.35) 

(A.36) 


(A. 36)  is  the  Kalman  gain  equation  as  previously  defined  in  (A. 21).  Also,  the  definition  of 
the  covariance  of  the  innovation  term  in  (A. 37)  comes  from  (A.36) 


=  (h’^Y  (s^]  ^ 


=  -iikpk\k-i  (-^k\ 


(A.37) 


Substituting  (A.36)  into  (A. 30),  we  obtain 


pA:|A:  _  pfc|fc-l  _  pfc|fc-l  _j_  p-fcpfc|fc-l  (A. 38) 

_  pfc|fc-l  _  j^fcp-fcpfc|A:-l  (A. 39) 


(A. 39)  proves  the  error  covariance  matrix  update  equation  given  in  (A.  19).  We  illustrate 
the  workflow  of  the  Kalman  filter  estimation  process  in  Figure  A.l. 


Example.  Consider  an  illustrative  example  in  which  the  target  state  consists  only 
of  the  target  position  in  the  two-dimensional  space 


(A.40) 


and  our  motion  model  for  the  target  is  linear 


=  Fx^  +  ^  2 

0  1 


(A.41) 


Figure  A.l:  The 


Estimation 
of  the  state 


State  Covariance 
Computation 


of  the  Kalman  filter  estimation(reproduced  from  [4]) 
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The  noise  is  zero-mean,  Gaussian 

0  ^  10 
E[v^]  =  E[v^  ]  =  Q  =  (A.42) 

0  ^  ^  0  1 

For  the  measurement  model,  we  have 

0 1  r .  1  , 

-h  (A.43) 

which  is  again  linear.  The  noise  in  the  measurement  model  is  also  zero-mean,  Gaussian 

F;[w^]  =  I  ”  I  E[w^  ( w'^ )  ]  =  R  =  I  *  ”1  (A.44) 


=  Hx^  +  = 


1  0 

X 

0  1 

y 

0 

.  /  ,\T 

1  0 

0 

II 

II 

0  1 

To  initialize  the  system,  we  choose 


x0  = 

5 

m0|0  = 

5 

p0|0  ^ 

1 

0 

7 

7 

0 

1 

(A.45) 


Using  those  initial  parameters,  we  can  simulate  the  target  moment  by  generating  random 
noise  sequences.  In  this  example,  we  show  the  tracking  of  the  target  using  Kalman  filter 
for  10  iterations.  The  first  random  noise  that  we  generate  is 


V 


0 


-0.4326 

-1.6656 


Hence  we  can  calculate  the  first  location  of  the  target  as 


xi  = 

'  1  r 

5 

-0.4326 

8.0674 

0  1 

7 

-1.6656 

5.3344 

Next,  we  generate  the  first  random  measurement  noise  as 


w 


1 


1.1892 

-0.0376 


Hence  we  can  calculate  the  first  measurement  value  as 


= 

1  0 

8.0674 

1.1892 

9.2566 

0  1 

5.3344 

-0.0376 

5.2968 

(A.46) 


(A.47) 


(A.48) 


(A.49) 
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If  we  continue  this  process,  we  can  simulate  more  samples  of  the  noise  sequences  as 


V 


1 


-1.1465 

2 

0.3273 

3 

-0.5883 

4 

1.0668 

5 

0.2944 

1.1909 

V  = 

0.1746 

V  = 

2.1832 

V  = 

0.0593 

V  = 

-1.3362 

-0.6918 

7 

-1.4410 

8 

0.8156 

9 

1.1908 

10 

-1.6041 

0.8580 

V  = 

0.5711 

V  = 

0.7119 

V  = 

-1.2025 

V  = 

0.2573 

Similarly,  repeat  the  process  of  generating  measurement  noise 


w 


2 


-0.1867 

-0.1364 

4 

-0.0956 

0.7143 

w6  = 

1.2540 

= 

= 

= 

0.7258 

0.1139 

-0.8323 

1.6236 

-1.5937 

-0.3999 

8 

1.2912 

9 

-0.0198 

10 

-1.0565 

0.6900 

w  = 

0.6686 

W  = 

-0.1567 

w  = 

1.4151 

Using  these  noise  sequences,  we  can  simulate  the  evolution  of  the  state  sequence  as 


9.5882 

3 

13.1781 

4 

15.9398 

5 

21.4481 

6 

26.2138 

6.5253 

X  = 

6.7 

X  = 

8.8832 

X  = 

8.9424 

X  = 

7.6063 

29.3251 

8 

32.1163 

9 

37.4496 

10 

43.5141 

8.4643 

X  = 

9.0354 

X  = 

9.7473 

X  = 

8.5449 

(A.50) 


Next,  we  repeat  the  process  of  generating  measurements  using  (A. 43),  and  we  have 


9.4015 

3 

13.0417 

4 

15.8441 

5 

22.1625 

6 

27.4678 

7.2511 

Z  = 

6.8139 

Z  = 

8.0508 

Z  = 

10.5660 

Z  = 

6.0125 

28.9252 

n.8 

33.4065 

n.9 

37.4298 

n.10 

42.4576 

9.1542 

z  = 

9.7040 

Z  = 

9.5906 

z  = 

9.96 

(A.51) 


At  this  point  we  have  built  a  system  in  which  the  system  is  moving  according  to  the  state 
sequences  in  (A.50)  and  the  sensor  measurements  are  in  (A.51). 
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Now  we  begin  to  perform  target  tracking  using  the  Kalman  filter.  First,  using 
(A. 8),  we  update  the  intermediate  as 

=  Fm°l°  =  ^  2  5  ^  8.5 

0  17  7.0 

For  we  use  (A. 15)  and  we  have 

iln  nin  'T  1  ^  1  0  1  0  1  0  2.25  0.5 

pi|0  ^  fP°|0F^  +  Q  =2  ^  ^ 

0  1  0  1  1  1  0  1  0.5  2 

The  next  two  items  to  be  calculated  in  the  Kalman  filter  process  are  the  innovation  term  S 
and  the  Kalman  gain  K.  We  use  (A. 37)  to  calculate  the  innovation  term  and  obtain 


.  ^  1  0  2.25  0.5 

gi  =HP00H^  +  R  = 

01  0.5  2 

To  calculate  the  Kalman  gain,  we  use  (A. 21)  and  we  have 

Ri  =  pi|0H^  (gi)-!  = 


1 

0 

+ 

1 

0 

0 

1 

0 

1 

3.25  0.5 
0.5  3 


2.25  0.5 

1  0 

0.3158  -0.0526 

0.6842  0.0526 

0.5  2 

0  1 

-0.0526  0.3421 

0.0526  0.6579 

The  final  two  items  that  we  seek  are  the  state  estimate  and  its  error  covariance  P^l^. 
We  already  have  K^,  and  and  substitute  them  into  (A. 18),  we  have 

=  m^l°  +  K^(zi -Hm^l°) 


8.5 

0.6842 

0.0526 

/ 

7.3757 

+ 

7.0 

0.0526 

0.6579 

V 

6.1924 

1  0 
0  1 


8.5 

7.0 


8.9280 

5.9193 


(A.54) 


This  is  our  estimate  of  the  target  state  at  /c  =  1.  The  true  state  of  the  target  at  /c  =  1  is 
(A. 47).  The  final  item  that  we  need  to  calculate  in  the  Kalman  filter  process  is  P^l^.  We 
use  (A. 19)  to  obtain 

pl|l  _  pl|0  _  Rijjpi|o 


2.25  0.5 

0.6842  0.0526 

1  0 

2.25  0.5 

0.5  2 

0.0526  0.6579 

0  1 

0.5  2 

0.6842  0.0526 
0.0526  0.6579 


(A.55) 
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Thus  we  complete  one  iteration  of  the  Kalman  filter  process.  If  we  continue  the  same 
process,  and  calculate  the  six  items  S^,  K^,  and  in  the  same 

manner,  we  have 


m2|2  = 

10.3420 

m3|3  = 

13.2682 

m"|4  = 

16.1874 

6.615 

6.7068 

7.4895 

m5|5  = 

21.5138 

m6|6  = 

26.8649 

m"l^  = 

29.5995 

9.4754 

7.4208 

8.3988 

m8|8  = 

33.6102 

m9|9  = 

37.7248 

42.4720 

9.1764 

9.3920 

9.7404 

(A.56) 


We  show  the  actual  state  sequence,  x^,  /c  =  0,  •  •  •  ,  10  and  the  estimated  state  sequence, 
m^lfc,  =  0,  •  •  •  ,  10  in  Figure  A. 2. 


Figure  A. 2:  Tracking  example  using  the  Kalman  filter 
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Appendix  B 

Target  Tracking  Using  Particle 
Filter 


In  this  appendix,  we  provide  a  target  tracking  example.  The  state  of  the  target, 
X,  evolves  based  on  the  following  motion  model 

+  h  +  (B.l) 

where  is  the  unpredictable  disturbances  during  the  traversal  of  the  target.  Note  that 
in  this  example,  the  target  state,  x,  is  one-dimensional.  The  variance  of  is  1.0.  At  the 
beginning,  when  time  step  k  =  0,  x^  =  0.1. 

Our  measurement  z  is  related  to  the  target  state,  x,  according  to  the  following 
nonlinear  measurement  model 

(^k\2 

=  ^  +  (B.2) 

where  is  the  measurement  noise.  Note  that  does  not  necessarily  model  the  measure¬ 
ment  noise  of  a  sensor  node.  Here,  we  provide  a  general  target  tracking  problem,  where 
(B.l)  models  any  general  target  motion  model  and  (B.2)  models  any  measurement  model. 

The  motion  model  is  apparently  similar  to  a  straight  line  in  (B.l).  However,  the 
measurement  model  in  (B.2)  is  nonlinear.  The  particle  filter  algorithm  is  designed  to  solve 
nonlinear  tracking  problems  like  this.  To  demonstrate  the  correctness  of  the  particle  filter 
algorithm,  we  set  the  variance  of  the  measurement  noise,  to  be  1.0  x  10“®  first.  The 
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number  of  samples,  Ng  =  1000  in  this  appendix.  The  evolution  of  the  target  state,  x^,  and 
the  estimated  target  state  using  particle  filter  is  shown  in  Figure  B.l.  In  this  experiment, 
the  (simulated)  true  target  states  are  marked  as  circles,  and  we  can  see  that  the  target  mo¬ 
tion  has  some  fluctuations,  since  the  noise  variance  is  nonzero.  The  estimated  target  states, 
using  particle  filter,  are  marked  as  stars.  We  can  see  from  Figure  B.l  that  particle  filter  is 
very  successful  in  tracking  the  target,  even  thought  measurement  model  is  nonlinear.  The 
mean  squared  error  (MSE)  between  the  true  target  states  and  the  estimated  target  states, 
over  20  time  steps,  is  7.5  x  10“^. 


MSE  =  7.51  >10''' 


Figure  B.l:  Target  tracking  result  using  particle  filters.  The  true  target  states  are  marked 
as  circles,  while  the  estimated  target  states  are  marked  as  stars.  The  variance  of  is  1.0, 
while  the  variance  of  is  1.0  x  10“^.  The  tracking  result  is  correct  over  20  time  steps. 

As  another  example,  we  use  the  same  motion  model  in  (B.l)  and  the  same  mea¬ 
surement  model  in  (B.2)  to  perform  another  tracking  experiment  with  the  variance  of  the 
measurement  noise,  w^,  equalling  1.0.  Again,  we  illustrate  the  true  target  states  (circles) 
and  the  estimated  target  states  (stars)  on  the  same  figure  in  Figure  B.2.  The  tracking 
result  in  shown  in  Figure  B.2.  The  MSE  is  0.1302,  which  is  much  larger  than  the  previous 
experiment.  We  can  also  visibly  see  the  mismatch  between  the  true  target  states  and  the 
estimated  target  states. 
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MSE  =  0.1302 


Figure  B.2:  Target  tracking  result  using  particle  filters.  The  variance  of  is  1.0,  while  the 
variance  of  is  1.0.  We  can  see  that  at  time  step  k  =  1  and  k  =  2,  there  exists  some 
distinguishable  tracking  error. 
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Appendix  C 

Relaxation  Labeling  as  an 
Optimization  Process 


C.l  Introduction 


Appendix  C  is  a  revision  of  [49] . 


Consider  a  system  of  N  objects  and  M  labels.  Let  Pi{Xj)  denote  the  “confidence” 
that  object  i  has  label  Xj.  The  confidence  Pi{Xj)  has  probability-like  properties: 


M 


i=i 

Following  [51],  we  iteratively  update  the  confidence  of  node  i  having  label  j  as 


(C.l) 


Pl{\,)  [l+qjiX,]] 


Dj 


(C.2) 


where  D\  =  P/(Afc)  [l  -|-  ql{Xk)] ,  /c  =  1,  •  •  •  ,  A  is  a  normalization  required  to  ensure  that 
Pi{Xj)  sums  to  1,  and  t  stands  for  iteration.  q\{Xj)  is  the  “support”  of  object  i  having  label 


\j.  More  details  follow. 


The  support  is  defined  as 

<liW  = 


(C.3) 
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where  the  superscript  for  iteration  is  omitted  for  clarity.  The  Rij{\,ii)  in  (C.3)  is  the 
compatibility  function  of  objects  i  and  j  having  label  A  and  //,  respectively.  The  design 
of  the  compatibility  function  is  problem-dependent,  and  we  require  that  —  1  <  Rij{X,  ^)  <  1. 


For  brevity,  we  define  Qi{X)  =  [l  -|-  qj{Xj)] ,  and  (C.2)  becomes 

*  ^  E,pmQ^{^^y 


c.2  Why  Relaxation? 


(C.4) 


Our  objective  is  to  show  that  the  relaxation  labeling  algorithm  can  be  formulated 
as  an  optimization  problem,  and  the  relaxation  labeling  algorithm  is  indeed  a  type  of  relax¬ 
ation  method.  Relaxation  is  an  algorithm  that  can  be  found  in  areas  such  as  optimization 
and  differential  equations.  We  define  the  “feasible  region”,  P,  to  be  the  collection  of  points 

P  =  [Pi(Ai),  Pi(A2),  •  •  •  ,  P2(Ai),  P2(A2),  •  •  •  ,  PN{XM)f  (C.5) 

where  each  point  in  P  satisfies  the  criteria  in  (C.l).  For  example,  0  <  Pi(A2)  <  1,  and 
Pi{Xj)  =  1.  Denote  the  objective  function  that  we  wish  to  minimize  as  F(P).  We 
iteratively  replace  the  current  point  P*  with  a  more  “relaxed”  point  P*'*'^,  and  eventually 
minimize  F(P).  A  relaxation  method  has  two  criteria: 

1.  P*+i  contains  P* 

2.  F(P*+i)  ^  F(P*) 

Now  we  demonstrate  that  relaxation  labeling  can  be  formulated  as  an  optimization 
problem.  First,  the  objective  function  can  be  formulated  as 


^^*(A)Q,(A)  (C.6) 

i  A 

and  subject  to  the  conditions  given  in  (C.l) 

0  <  PiiXj)  <  1  (C.7) 

M 

i=i 


(C.8) 
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(the  reason  for  choosing  this  objective  function  is  explained  in  the  next  section).  We  can 
observe  from  (C.4)  that  relaxation  labeling  is  an  iterative  process,  and  we  can  rewrite  (C.4) 
as  a  gradient  method 

P^+\X)  =  P^{X)  +  MW  (C.9) 


where 


and 


(C.IO) 


=  (C.ll) 

To  satisfy  the  first  criteria  of  a  relaxation  method,  we  require  that  P^~^^{X)  also 
satisfies  the  conditions  given  in  (C.l).  Since  P^~^^{X)  =  Pf{X)  +  r*(A),  we  have 

M  MM 

(C.12) 

j=i  j=i  j=i 

Since  Ylf=i  =  1  and  Ylf=i  Pfi^j)  =  1)  have 


M 


Eri(Aj)  =  0  i  =  ,N 

j  =  l 

(C.13) 

and 

ri{Xj)  ^0  if  Pi{Xj)  =  0 

(C.14) 

To  satisfy  the  second  criteria  of  a  relaxation  method,  we  have 

F(P*+i)  ^  F(P*) 

(C.15) 

^  ^(pt+i)  _^(pt)  ^  0 

_  F(P*+i)-F(P*)  ^ 

At 

(C.16) 

(C.17) 

dF(P)  „ 

^  dt  ^ 

^  dF{P)  dP{X,)  ^ 

dPi{Xj)  dt  ^ 

(C.18) 

(C.19) 

Denoting  \/F  = 

have 

■  dF{P) 
[dPiiXj)  ■  ■  -J 

and  observe  that  r  in  (C.12)  is  the  change  of  P  over  time,  we 

V  P  •  r  ^  0 

(C.20) 

where 

r  =  [n(Ai),ri(A2),  •  •  •  ,  r2(Ai),  r2(A2),  •  •  •  ,rN{XM)]^ 

(C.21) 
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C.3  Design  of  the  Objective  Function 

In  summary,  the  relaxation  labeling  process  can  be  formulated  as  the  following 
optimization  problem 


minimize  ^(p)  =  -^EE  P^{^)Qi{X)  (C.22) 

i  A 

subject  to  0  <  Pi{Xj)  <1  i  =  1,  ■  ■  ■  ,  N  (C.23) 

M 

and  Pi{Xj)  =  1  i  =  1,  ■  ■  ■  ,  N  j  =  1,  ■  ■  ■  ,  M  (C.24) 

The  relaxation  labeling  process  is  a  gradient  method  to  find  r  satisfying 


Pt\X) 

M 

= 

Pi{X)  +  r\{X) 

(C.25) 

X]d(Aj) 

1=1 

n{Xj) 

= 

0  i  =  1, • •  •  , iV 

(C.26) 

0  if  Pi{Xj)  =  0 

(C.27) 

S/P  ■  r 

0 

(C.28) 

The  reason  for  choosing  the  objective  function  in  (C.22)  is  that  inconsistency  is 
defined  as  the  difference  between  Pi{X)  and  Qi{X)  [24],  Intuitively,  Pi{X)  is  what  every 
object  “thinks”  about  its  own  labeling,  and  Qi{X)  is  what  its  neighbors  “think”  about  it  [5]. 
Hence  if  we  maximize  the  following  function  [5] 

sEE  P^{X)Q^{X)  (C.29) 

i  A 

the  inconsistency  will  be  minimized,  since  (C.29)  is  maximized  when  Pi{X)  =  Qi{X)  (minimal 
inconsistency).  To  maximize  (C.29)  is  equivalent  to  minimizing  (C.22).  The  ^  in  (C.22)  is 
to  scale  Qi{X)  to  be  between  0  and  1,  that  is,  0  <  ^Qi{X)  <  1. 

C.4  Proof  of  Relaxation  Labeling  as  an  Optimization  Process 

We  need  to  prove  that  ri{X)  in  (C.IO)  satisfies  (C.26),  (C.27)  and  (C.28).  Denote 
X  =  have 
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A  A 

ExP^WQ^W  Exim)^] 

X  X 

X  XEa-P^(A) 

X  X 

XX 

X  “  X 

=  1-1 


(C.30) 

(C.31) 

(C.32) 

(C.33) 

(C.34) 

(C.35) 


and  if  Pj(A)  =  0  then  ri{X)  =  0,  and  thus  equations  (C.26)  and  (C.27)  are  satisfied. 


Next  we  check  equation  (C.28).  Now  we  use  different  indexes  in  (C.6) 


Fip)  =  -lY.Y.p>^(x)Qk{Xi) 

(C.36) 

k  Xi 

and  we  have 

^  dF 

~  dPiW 

(C.37) 

k  1 

dPkiXi)  .dQk{Xi 

_  dPi{X)  ^  dPi{X) 

(C.38) 

=  4ee 

k  1 

Qk{Xi)  +  Pk{Xi)  ^  ^  dP  (X)  Pkj{Xi,fi)Pj{n) 

j  F  ) 

(C.39) 

=  “2  ^  ^  [Qki^l)  +  Pk{^l)Rki{k^  ^)] 

k  1 

(C.40) 

k  1 

(C.41) 

= 

(C.42) 

=  -Q 

(C.43) 

So 

V  F -r  = 

-'^'^niX)Qi{X) 

(C.44) 

i  A 


i  X 


(C.45) 
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Furthermore, 


Y,m)Q^w  [Q*(A)-x] 

A 


=  ^Pi(A)[Q,(A)-X]2  +  X^Pi(A)[gi(A)-X]  (C.46) 

A  A 

=  ^  Pi{X)  [Q,(A)  -  X]2  +  X  ^  -  X  ^  P,(A)X 

A  A  A 

=  ^Pi(A)[Q,(A)-X]2  +  x2-X2^Pi(A)  (C.47) 

A  A 

=  ^  P,(A)  [Q,(A)  -  X]2  +  X2  -  X2  (C.48) 

A 

=  J;p,(A)[Q,(A)-X]2  (C.49) 

A 

^  0  (C.50) 


and  since  X  ^  0,  this  implies  that  \jF  •  r  ^  0  and  (C.28)  is  satisfied. 
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Appendix  D 

Probability  of  Being  Malicious  for 
Nodes  in  Secure  Tracking 


Here  we  have  the  probabilities  for  all  the  nodes  in  the  last  secure  tracking  experi¬ 
ment  in  Section  6.3.3. 


Figure  D.l:  Probability  of  being  malicious  nodes  for  the  five  nodes  at  t  =  20  and  another 
five  nodes  at  t  =  21.  The  malicious  nodes  are  and  which  agrees  with  what  we  have 
found  here. 
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Figure  D.2:  Probability  of  being  malicious  nodes  for  the  five  nodes  at  t  =  30  and  another 
hve  nodes  at  t  =  31.  The  malicious  nodes  are  and  which  agrees  with  what  we  have 
found  here. 


Figure  D.3:  Probability  of  being  malicious  nodes  for  the  five  nodes  at  t  =  40  and  another 
five  nodes  at  t  =  41.  The  malicious  nodes  are  and  which  agrees  with  what  we  have 
found  here. 
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Figure  D.4:  Probability  of  being  malicious  nodes  for  the  five  nodes  at  t  =  50  and  another 
five  nodes  at  t  =  51.  The  malicious  nodes  are  and  which  agrees  with  what  we  have 
found  here. 


